I'm trying to scrape data from a website, but the code below returns an empty list ("[]"). I can't figure out why. When I check the generated HTML, there seems to be a lot of
. I'm not sure what is wrong with my code.
import requests
from bs4 import BeautifulSoup

url = "http://www.hkex.com.hk/eng/csm/price_movement_result.htm?location=priceMoveSearch&PageNo=1&SearchMethod=2&mkt=hk&LangCode=en&StockType=ALL&Ranking=ByMC&x=51&y=6"
html = requests.get(url)  # fetch the raw page over HTTP
soup = BeautifulSoup(html.text, 'html.parser')
rows = soup.find_all('tr')  # every table row in the parsed document
print(rows)
I have also tried parsing the response without ".text", and using "lxml" instead of "html.parser", but ended up with the same result.
EDIT: Found a workaround: I used Selenium to open the page and grabbed the page source that way instead.
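from selenium import webdriver
from bs4 import BeautifulSoup

url = "http://www.hkex.com.hk/eng/csm/price_movement_result.htm?location=priceMoveSearch&PageNo=1&SearchMethod=2&mkt=hk&LangCode=en&StockType=ALL&Ranking=ByMC&x=51&y=6"
driver = webdriver.Firefox()
driver.get(url)
f = driver.page_source  # HTML as the browser currently sees it
driver.quit()  # close the browser once the source has been captured
soup = BeautifulSoup(f, 'html.parser')
rows = soup.find_all('tr')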
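For anyone hitting the same thing, a quick way to compare the two sources is to count the tags in each. If the raw response has no "tr" tags but plenty of "script" tags, that would suggest the table is filled in by JavaScript after the page loads, which would be consistent with the Selenium workaround. This is only a minimal sketch using the standard library: `TagCounter`, `count_tags`, and the two sample strings are hypothetical stand-ins for `html.text` and `driver.page_source`, not output from the real site.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how often each start tag appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def count_tags(markup):
    """Return a Counter mapping tag name -> number of start tags."""
    parser = TagCounter()
    parser.feed(markup)
    parser.close()
    return parser.counts


# Hypothetical stand-ins for html.text and driver.page_source.
raw_html = "<html><body><script>loadTable()</script></body></html>"
rendered_html = (
    "<html><body><table>"
    "<tr><td>A</td></tr><tr><td>B</td></tr>"
    "</table></body></html>"
)

raw_counts = count_tags(raw_html)
rendered_counts = count_tags(rendered_html)
print("raw:", raw_counts["tr"], "rows,", raw_counts["script"], "scripts")
print("rendered:", rendered_counts["tr"], "rows")
```

In practice you would pass `html.text` and `driver.page_source` to `count_tags` instead of the sample strings and compare the "tr" counts.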