Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


beautifulsoup - Python WebScraping FlashScore

I am using the following code to extract the outcome of the matches on FlashScore:

from collections import defaultdict

import pandas as pd
from requests_html import AsyncHTMLSession

url = 'https://www.flashscore.com/football/netherlands/eredivisie/results/'

asession = AsyncHTMLSession()

async def get_scores():
    # Fetch the page, then render it in headless Chromium so that the
    # match rows generated by JavaScript are present in r.html.
    r = await asession.get(url)
    await r.html.arender()
    return r

# run() returns one result per coroutine passed in; only one is passed here.
results = asession.run(get_scores)
results = results[0]

# One list of elements per column, selected by CSS class.
times = results.html.find("div.event__time")
home_teams = results.html.find("div.event__participant.event__participant--home")
scores = results.html.find("div.event__scores.fontBold")
away_teams = results.html.find("div.event__participant.event__participant--away")
event_part = results.html.find("div.event__part")

dict_res = defaultdict(list)

# Assumes every selector matched the same number of elements as `times`.
for ind in range(len(times)):
    dict_res['times'].append(times[ind].text)
    dict_res['home_teams'].append(home_teams[ind].text)
    dict_res['scores'].append(scores[ind].text)
    dict_res['away_teams'].append(away_teams[ind].text)
    dict_res['event_part'].append(event_part[ind].text)

df_res = pd.DataFrame(dict_res)
print(df_res)
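
The loop above indexes all five lists by position in `times`, so it only works if every selector matched the same number of elements. A minimal sketch of a stricter way to combine the columns, using plain strings in place of the scraped elements. The helper name `build_rows` is hypothetical and not part of requests_html or pandas:

```python
def build_rows(columns):
    """Combine named columns of strings into one dict per match.

    `columns` maps a column name to a list of cell texts. Raises
    ValueError when the lists differ in length, instead of failing
    with an IndexError or silently dropping cells.
    """
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"column lengths differ: {lengths}")
    names = list(columns)
    # zip(*...) walks the columns in parallel, one row at a time.
    return [dict(zip(names, row)) for row in zip(*columns.values())]


# Sample data copied from the first two rows of the output in the question.
sample = {
    "times": ["22.01. 20:00", "17.01. 16:45"],
    "home_teams": ["Willem II", "Ajax"],
    "scores": ["1 - 3", "1 - 0"],
    "away_teams": ["Zwolle", "Feyenoord"],
    "event_part": ["(1 - 0)", "(1 - 0)"],
}
rows = build_rows(sample)
```

With the scraped data, each list would be built as `[el.text for el in times]` and so on, and the resulting list of dicts can be passed straight to `pd.DataFrame(rows)`.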

This produces the following output:

            times        home_teams scores  away_teams event_part
0    22.01. 20:00         Willem II  1 - 3      Zwolle    (1 - 0)
1    17.01. 16:45              Ajax  1 - 0   Feyenoord    (1 - 0)
2    17.01. 14:30         Groningen  2 - 2      Twente    (0 - 2)
3    17.01. 14:30             Venlo  1 - 1  Heerenveen    (0 - 0)
4    17.01. 12:15          Waalwijk  1 - 1   Willem II    (1 - 0)
..            ...               ...    ...         ...        ...
101  25.10. 20:00          Den Haag  2 - 2  AZ Alkmaar    (0 - 1)
102  25.10. 16:45          Waalwijk  2 - 2   Feyenoord    (0 - 0)
103  25.10. 14:30  Sparta Rotterdam  1 - 1    Heracles    (0 - 0)
104  25.10. 14:30           Vitesse  2 - 1         PSV    (1 - 0)
105  25.10. 12:15           Sittard  1 - 3   Groningen    (0 - 2)

[106 rows x 5 columns]

However, the page at https://www.flashscore.com/football/netherlands/eredivisie/results/ has a 'Show more matches' button at the bottom. The output contains only the matches that are loaded initially, not the additional ones that appear after clicking 'Show more matches'. Is it possible to extract those additional matches as well?
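
One possible direction, as an untested sketch rather than a confirmed solution: `arender()` accepts a `script` argument containing JavaScript that is evaluated in the page before the HTML is read back, so the button could be clicked from there until it disappears. The CSS selector `a.event__more` is an assumption about FlashScore's markup and has to be checked in the browser's inspector. The helper names `build_click_script` and `get_all_scores` are hypothetical.

```python
import json


def build_click_script(selector, max_clicks=20, pause_ms=1000):
    """Return JavaScript source that clicks `selector` until it is gone.

    The script stops after `max_clicks` clicks and waits `pause_ms`
    milliseconds after each click so new rows have time to load.
    """
    quoted = json.dumps(selector)  # safely quote the selector for JavaScript
    return f"""async () => {{
        const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
        let clicks = 0;
        while (clicks < {int(max_clicks)}) {{
            const button = document.querySelector({quoted});
            if (!button) break;
            button.click();
            clicks += 1;
            await pause({int(pause_ms)});
        }}
        return clicks;
    }}"""


# ASSUMPTION: selector of the 'Show more matches' link; verify on the live page.
SHOW_MORE_SELECTOR = "a.event__more"


async def get_all_scores(asession, url, selector=SHOW_MORE_SELECTOR):
    """Fetch `url` and render it while clicking the 'show more' button."""
    r = await asession.get(url)
    # `sleep` gives the last batch of rows time to appear before the
    # rendered HTML is captured.
    await r.html.arender(script=build_click_script(selector), sleep=2)
    return r


# Usage with the session and url from the question (needs network access):
# results = asession.run(lambda: get_all_scores(asession, url))[0]
```

The same `results.html.find(...)` calls from the question would then run against the expanded page. If the site changes its markup or loads matches differently, the selector and the pause length need adjusting.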

Question from: https://stackoverflow.com/questions/65860397/python-webscraping-flashscore


