python - Is there a better way to read many HTML URLs?



1 Reply

  1. Add all your URLs and their 'Sett.' values to a list of tuples.
  2. Iterate over them with a for loop.

This is the code:

import pandas as pd

urls = [('https://fbref.com/it/comp/11/calendario/Risultati-e-partite-di-Serie-A', 15),
        ('https://fbref.com/it/comp/26/calendario/Risultati-e-partite-di-Super-Lig', 15),
        ('https://fbref.com/it/comp/12/calendario/Risultati-e-partite-di-La-Liga', 13),
        ('https://fbref.com/it/comp/13/calendario/Risultati-e-partite-di-Ligue-1', 13),
        ('https://fbref.com/it/comp/20/calendario/Risultati-e-partite-di-Bundesliga', 13)
        ]  # Add every URL together with its 'Sett.' threshold

frames = []

for url, sett in urls:
    # The first table on the page holds the fixtures; keep only the columns we need
    df2 = pd.read_html(url)[0][['Sett.', 'Data', 'Casa', 'Punteggio', 'Ospiti']]
    # Keep only 0–0 matches played after this league's 'Sett.' threshold
    frames.append(df2[(df2['Punteggio'] == '0–0') & (df2['Sett.'] > sett)])

result = pd.concat(frames)

I didn't add every URL to the list by hand because there are too many of them; the list above only shows a few as examples.
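If the full list is long, it can also be built programmatically instead of typed out by hand, since all the URLs above share the same pattern. This is only a minimal sketch: the competition IDs and page slugs are copied from the URLs in the code above, while the leagues dictionary and its layout are my own assumption, and any entry you add has to match a real fbref competition page.

# Hypothetical mapping (my naming): (competition id, page slug) -> 'Sett.' threshold.
# The IDs and slugs below are the ones that appear in the URLs above.
leagues = {
    (11, 'Risultati-e-partite-di-Serie-A'): 15,
    (26, 'Risultati-e-partite-di-Super-Lig'): 15,
    (12, 'Risultati-e-partite-di-La-Liga'): 13,
    (13, 'Risultati-e-partite-di-Ligue-1'): 13,
    (20, 'Risultati-e-partite-di-Bundesliga'): 13,
}

# Rebuild the same list of (url, sett) tuples that the for loop above expects
urls = [
    (f'https://fbref.com/it/comp/{comp_id}/calendario/{slug}', sett)
    for (comp_id, slug), sett in leagues.items()
]

The resulting urls list drops straight into the for loop shown above.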


