Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Is there a way to create loop logic that selects a different option for each dropdown list and repeats the process for the next option?

Does anyone know how to automate web scraping of a dynamic web page using Selenium in Python or R?

Situation:
The page has several dropdown lists, one per category, and each has several options. For example, Brand is one dropdown, with options such as ALFA ROMEO, AUDI, and so on. The dropdowns depend on each other: an option must be selected in one dropdown, starting from the first, before an option can be selected in the next. Once an option has been selected in every dropdown, the next step is to click the "Get Valuation" button, which displays the valuation information in a table.

Objective:
The objective is to collect the table output for every combination of options across all categories and store it in .csv format.

[Screenshot 1: list of options in a category]

[Screenshot 2: output]

Problem:
I am trying to write logic that selects one option at a time in each category, moves on to the next category, and finally clicks the submit button to get the valuation information. However, my script only selects the one option per dropdown that I pass in as input. To cover the other options, I have to run the same script over and over with a different index number for each dropdown (each dropdown is located by its element id). That is tedious and not automatic, and it can also become a problem whenever a list is updated.

Code sample to replicate:

import time
import requests
import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By

url = "https://www.carbase.my/tool/car-market-value-guide"
chrome_path = r"C:\Users\Noobie\Documents\utils\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
time.sleep(2)
driver.get(url)

def multi_define(driver, element_id, indexes):
    # select the given option indexes in the dropdown with this id
    select = Select(driver.find_element(By.ID, element_id))
    for index in indexes:
        select.select_by_index(index)

# ids of the dropdowns, in the order they have to be filled in
elem_cat = ('brand','family','year','cc','transmission','variant')

# one hard-coded option index per dropdown
multi_define(driver, elem_cat[0], [2])
multi_define(driver, elem_cat[1], [2])
multi_define(driver, elem_cat[2], [2])
multi_define(driver, elem_cat[3], [2])
multi_define(driver, elem_cat[4], [1])
multi_define(driver, elem_cat[5], [1])
driver.find_element(By.ID, "loan-calculate").click()

Output: Refer to screenshot 2

Could you advise on how to write a script that loops over the options of each dropdown list, selecting a different one each time, and then repeats the process for the next option?

Question from: https://stackoverflow.com/questions/65862149/is-there-a-way-to-create-a-loop-logic-that-can-select-different-option-for-each

1 Reply

For each select element, make the function select options one by one, then call itself recursively on the next select element.

While doing this, you can keep the currently selected items in a list, and when you reach the last select element, use them however you want. Here is your code, modified so that it works on the URL you posted.

driver.get(url)

elem_cat = ('brand','family','year','cc','transmission','variant')

def multi_define(driver, elem_index, selected):
    # locate the dropdown on every call: its options depend on the
    # choices already made in the previous dropdowns
    select = Select(driver.find_element(By.ID, elem_cat[elem_index]))
    options = select.options

    # select each option, starting at 1 to skip the first entry
    for i in range(1, len(options)):
        select.select_by_index(i)
        time.sleep(0.3)  # give the next dropdown time to update

        if elem_index == len(elem_cat) - 1:
            # we are at the last select item
            driver.find_element(By.ID, "loan-calculate").click()
            # build a new list here; reassigning selected inside the loop
            # would carry earlier variants over into later iterations
            row = selected + [options[i].text]

            # clicked the button. selected options are in row.
            # do whatever you need to do with this information
            time.sleep(0.1)
            print(row)

        elif options[i].text != "": # skip placeholders
            # recursive call for the next select. you can change [options[i].text]
            # to whatever information you need about this option
            multi_define(driver, elem_index + 1, selected + [options[i].text])

multi_define(driver, 0, [])

Output of this code:

['ALFA ROMEO', '145', '2001', '1598', '5 SP MANUAL']
['ALFA ROMEO', '145', '2000', '1598', '5 SP MANUAL']
['ALFA ROMEO', '145', '1999', '1598', '5 SP MANUAL']
['ALFA ROMEO', '145', '1998', '1598', '5 SP MANUAL']
['ALFA ROMEO', '145', '1997', '1598', '5 SP MANUAL']
['ALFA ROMEO', '145', '1996', '1598', '5 SP MANUAL']
['ALFA ROMEO', '145', '1995', '1598', '5 SP MANUAL']
['ALFA ROMEO', '145', '1995', '1598', '5 SP MANUAL']
['ALFA ROMEO', '146', '2002', '1598', '5 SP MANUAL']
...
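
The same recursive idea can be tried without a browser, together with the .csv step the question asks for. This is only a minimal sketch under stated assumptions: `CATALOG`, `fake_options`, `iter_combinations`, and `write_rows` are hypothetical names, and the nested dictionary is made-up stand-in data for the dependent dropdowns, not values from the site. It uses three levels instead of six to stay short.

```python
import csv
import io

# Hypothetical stand-in for the page: each key is an option, and its value
# holds the options that the next dropdown offers once that key is selected.
CATALOG = {
    "BRAND_A": {"MODEL_1": ["2001", "2000"], "MODEL_2": ["2002"]},
    "BRAND_B": {"MODEL_3": ["1999"]},
}


def fake_options(selected):
    """Return the options offered after the choices in `selected` were made."""
    node = CATALOG
    for choice in selected:
        node = node[choice]
    return list(node)  # dict -> its keys, list -> its items


def iter_combinations(get_options, depth, selected=()):
    """Yield every complete combination, one dropdown level per recursion."""
    for option in get_options(selected):
        path = selected + (option,)  # new tuple, so sibling options never mix
        if len(path) == depth:
            yield path  # last dropdown reached: one complete row
        else:
            yield from iter_combinations(get_options, depth, path)


def write_rows(rows, header, fileobj):
    """Write a header line followed by one CSV line per combination."""
    writer = csv.writer(fileobj)
    writer.writerow(header)
    writer.writerows(rows)


rows = list(iter_combinations(fake_options, depth=3))
buffer = io.StringIO()  # in-memory file, used here instead of a real .csv
write_rows(rows, ["brand", "family", "year"], buffer)
print(buffer.getvalue())
```

With Selenium, `fake_options` would be replaced by a function that applies the selections and reads the option texts of the next dropdown, and `buffer` by a real file opened with `newline=""`, as the `csv` module documentation recommends.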
