First, I just want to point out that the .format() call in 'boxscores3.csv'.format(start_season) doesn't do anything here. The string has no placeholder, so it still returns 'boxscores3.csv'. You need a placeholder inside the string for the value to end up in the filename.
For example, if start_season = '2020', then 'boxscores3_{0}.csv'.format(start_season) gives you 'boxscores3_2020.csv'.
So if you want the filename to be dynamic, change it to:
box_df.to_csv('boxscores3_{0}.csv'.format(start_season),index=None)
or
box_df.to_csv('boxscores3_{some_variable}.csv'.format(some_variable = start_season),index=None)
or
box_df.to_csv('boxscores3_%s.csv' % start_season, index=None)
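To see the difference without writing any files, here is a small sketch that only builds the filename strings. The value '2020' is just the example value from above, and the f-string line is an extra option I'm adding (it needs Python 3.6 or later).

```python
start_season = '2020'  # example value only

# No placeholder in the string, so the argument is silently ignored.
unchanged = 'boxscores3.csv'.format(start_season)

# Each of these puts the season into the filename.
positional = 'boxscores3_{0}.csv'.format(start_season)
named = 'boxscores3_{season}.csv'.format(season=start_season)
percent = 'boxscores3_%s.csv' % start_season
fstring = f'boxscores3_{start_season}.csv'  # extra option, Python 3.6+

print(unchanged)   # boxscores3.csv
print(positional)  # boxscores3_2020.csv
```

Any of the last four strings can be passed to box_df.to_csv() as the path.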
Next, until you can provide a sample of that csv file, specifically row 10653, I can't really help you with that specific issue.
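If it helps with pulling that sample, here is a minimal sketch for printing a few raw lines around a given line of a file. The function name read_lines_around is made up for this example, and I'm assuming 10653 refers to a physical line counted from 1, which may not be how your error counts rows (a header line or a quoted field containing a line break would shift it).

```python
import itertools


def read_lines_around(path, line_number, context=2):
    """Return (line_number, text) pairs around a 1-based physical line."""
    first = max(1, line_number - context)
    last = line_number + context
    with open(path, newline="", encoding="utf-8") as handle:
        # islice skips ahead lazily, so the whole file is not loaded into memory
        window = itertools.islice(handle, first - 1, last)
        return [(first + offset, text.rstrip("\r\n"))
                for offset, text in enumerate(window)]


# Usage (assuming the file is in the working directory):
# for number, text in read_lines_around('boxscores3.csv', 10653):
#     print(number, repr(text))
```

Printing with repr() makes stray quotes, tabs, or extra delimiters visible, which is usually what breaks a single row.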
In the meantime, I can offer an alternative solution using the ESPN API.
You can get box scores of college basketball games, provided you have the gameId. So this code goes through each date (you need to give it a start date) and gets the gameId of each game. Then, with the gameIds, it gets each box score from another API endpoint. Unfortunately, the box score isn't returned as JSON but as HTML (which is fine, because we can use pandas to read in the table).
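To make the first step concrete without hitting the network, here is a minimal sketch of filtering a schedule response down to gameIds. The sample payload is hand-made and hypothetical (the ids, team codes, and 'STATUS_FINAL' are placeholders); it only mimics the keys that the full code further down reads, and the real response contains a lot more.

```python
SKIPPED_STATUSES = {'STATUS_POSTPONED', 'STATUS_CANCELED', 'STATUS_SCHEDULED'}


def extract_games(schedule_json, date_str):
    """Return {gameId: teams} for games that are not postponed, canceled, or still scheduled."""
    games = schedule_json['content']['schedule'][date_str]['games']
    kept = {}
    for game in games:
        if game['status']['type']['name'] in SKIPPED_STATUSES:
            continue
        # Assumes shortName looks like 'AWAY @ HOME'; other layouts would need handling
        away, home = (part.strip() for part in game['shortName'].split('@'))
        kept[game['id']] = {'awayTeam': away, 'homeTeam': home}
    return kept


# Hand-made sample, NOT real ESPN data
sample = {'content': {'schedule': {'20210101': {'games': [
    {'id': '1001', 'shortName': 'AAA @ BBB',
     'status': {'type': {'name': 'STATUS_FINAL'}}},
    {'id': '1002', 'shortName': 'CCC @ DDD',
     'status': {'type': {'name': 'STATUS_POSTPONED'}}},
]}}}}

print(extract_games(sample, '20210101'))
# {'1001': {'awayTeam': 'AAA', 'homeTeam': 'BBB'}}
```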
I don't know exactly what you need or want, but this may help you see other ways to get data while you are learning Python:
Code:
from io import StringIO
import datetime

import pandas as pd
import requests
from tqdm import tqdm

# Build the list of dates (YYYYMMDD) from the start date through today
date_list = []
sdate = datetime.date(2021, 1, 1)  # start date
edate = datetime.date.today()  # end date
delta = edate - sdate  # as timedelta
for i in range(delta.days + 1):
    day = sdate + datetime.timedelta(days=i)
    date_list.append(day.strftime("%Y%m%d"))

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36'}
payload = {
    'xhr': '1',
    'device': 'desktop',
    'country': 'us',
    'lang': 'en',
    'region': 'us',
    'site': 'espn',
    'edition-host': 'espn.com',
    'site-type': 'full'}

# Get gameIds
gameId_dict = {}
for dateStr in tqdm(date_list):
    url = 'https://secure.espn.com/core/mens-college-basketball/schedule/_/date/{dateStr}/group/50'.format(dateStr=dateStr)
    games = requests.get(url, headers=headers, params=payload).json()['content']['schedule'][dateStr]['games']
    gameId_dict[dateStr] = []
    for game in games:
        # Skip games that were postponed, canceled, or not yet played
        if game['status']['type']['name'] in ['STATUS_POSTPONED', 'STATUS_CANCELED', 'STATUS_SCHEDULED']:
            continue
        game_info = {}
        game_info[game['id']] = {}
        game_info[game['id']]['awayTeam'] = game['shortName'].split('@')[0].strip()
        game_info[game['id']]['homeTeam'] = game['shortName'].split('@')[1].strip()
        gameId_dict[dateStr].append(game_info)

# Collect one DataFrame per game and concatenate once at the end
game_dfs = []

# Box score - gameId needed
box_url = 'https://secure.espn.com/core/mens-college-basketball/boxscore'
for dateStr, games in tqdm(gameId_dict.items()):
    for game in tqdm(games):
        for gameId, teams in game.items():
            payload = {
                'gameId': gameId,
                'xhr': '1',
                'render': 'true',
                'device': 'desktop',
                'country': 'us',
                'lang': 'en',
                'region': 'us',
                'site': 'espn',
                'edition-host': 'espn.com',
                'site-type': 'full'}
            data = requests.get(box_url, headers=headers, params=payload).json()

            # The box score comes back as HTML: table 0 is the away team, table 1 the home team
            tables = pd.read_html(StringIO(data['content']['html']), header=1)

            away_df = tables[0].rename(columns={'Bench': 'Player'})
            away_df = away_df[away_df['Player'] != 'TEAM']
            away_df = away_df[away_df['Player'].notna()].copy()
            away_df['Team'] = teams['awayTeam']
            away_df['Home_Away'] = 'Away'
            away_df['Starter_Bench'] = 'Bench'
            away_df.loc[0:4, 'Starter_Bench'] = 'Starter'
            away_df['Player'] = away_df['Player'].str.split(r"([a-z]+)([A-Z].+)", expand=True)[2]
            # Split the trailing position letters (e.g. F, G) off the player name
            away_df[['Player', 'Pos']] = away_df['Player'].str.extract('^(.*?)([A-Z]+)$', expand=True)

            home_df = tables[1].rename(columns={'Bench': 'Player'})
            home_df = home_df[home_df['Player'] != 'TEAM']
            home_df = home_df[home_df['Player'].notna()].copy()
            home_df['Team'] = teams['homeTeam']
            home_df['Home_Away'] = 'Home'
            home_df['Starter_Bench'] = 'Bench'
            home_df.loc[0:4, 'Starter_Bench'] = 'Starter'
            home_df['Player'] = home_df['Player'].str.split(r"([a-z]+)([A-Z].+)", expand=True)[2]
            home_df[['Player', 'Pos']] = home_df['Player'].str.extract('^(.*?)([A-Z]+)$', expand=True)

            game_df = pd.concat([away_df, home_df], sort=False)
            game_df['Date'] = datetime.datetime.strptime(dateStr, '%Y%m%d').strftime('%m/%d/%y')
            game_dfs.append(game_df)

# pd.concat raises on an empty list, so fall back to an empty frame if no games were found
full_df = pd.concat(game_dfs, sort=False).reset_index(drop=True) if game_dfs else pd.DataFrame()
Output:
print(full_df.head(30).to_string())
Player MIN FG 3PT FT OREB DREB REB AST STL BLK TO PF PTS Team Home_Away Starter_Bench Pos Date
0 H. Drame 22 2-7 0-2 0-0 1 1 2 0 0 1 1 4 4 SPU Away Starter F 01/01/21
1 F. Drame 20 2-3 0-1 0-0 1 5 6 0 3 1 1 4 4 SPU Away Starter F 01/01/21
2 M. Lee 24 2-11 0-4 1-2 1 2 3 0 0 0 3 0 5 SPU Away Starter G 01/01/21
3 D. Banks 26 4-12 1-6 2-4 0 5 5 6 1 0 1 1 11 SPU Away Starter G 01/01/21
4 D. Edert 32 6-10 2-4 1-2 0 4 4 0 2 0 1 2 15 SPU Away Starter G 01/01/21
5 O. Diahame 1 0-1 0-0 0-0 0 0 0 0 0 0 0 0 0 SPU Away Bench F 01/01/21
6 K. Ndefo 23 7-10 0-0 3-3 1 6 7 2 1 5 1 4 17 SPU Away Bench F 01/01/21
7 B. Diallo 14 0-2 0-0 0-0 1 1 2 0 0 0 0 0 0 SPU Away Bench G 01/01/21
8 T. Brake 24 1-2 0-1 0-0 0 0 0 1 0 0 0 1 2 SPU Away Bench G 01/01/21
9 M. Silvera 6 0-0 0-0 0-0 0 1 1 1 0 0 1 0 0 SPU Away Bench G 01/01/21
10 N. Kamba 8 0-1 0-0 0-0 0 0 0 0 0 0 2 0 0 SPU Away Bench G 01/01/21
11 J. Fritz 38 5-9 0-0 4-5 2 8 10 4 1 3 1 3 14 CAN Home Starter F 01/01/21
12 J. White 17 4-7 1-2 0-0 1 4 5 2 0 0 5 2 9 CAN Home Starter F 01/01/21
13 A. Fofana 20 1-7 1-4 1-2 0 1 1 1 0 0 1 2 4 CAN Home Starter G 01/01/21
14 A. Harried 23 3-10 1-4 0-1 2 5 7 1 1 1 0 1 7 CAN Home Starter G 01/01/21
15 J. Henderson 37 3-8 3-5 5-6 0 1 1 2 0 0 1 1 14 CAN Home Starter G 01/01/21
16 G. Maslennikov 2 0-2 0-1 0-0 0 0 0 0 0 0 1 1 0 CAN Home Bench F 01/01/21
17 M. Green 18 3-4 0-0 2-2 1 4 5 2 1 0 2 1 8 CAN Home Bench F 01/01/21
18 S. Hitchon 3 0-0 0-0 0-0 0 0 0 1 0 0 0 0 0 CAN Home Bench F 01/01/21
19 S. Uijtendaal 20 2-4 1-2 0-0 0 0 0 0 1 0 0 2 5 CAN Home Bench G 01/01/21
20 M. Brandon 19 4-5 1-2 0-0 0 3 3 2 2 0 2 1 9 CAN Home Bench G 01/01/21
21 A. Ahemed 3 0-0 0-0 0-0 0 1 1 1 0 0 0 1 0 CAN Home Bench G 01/01/21
22 K. Nwandu 34 5-13 1-3 0-1 1 3 4 3 1 0 3 1 11 NIAG Away Starter F 01/01/21
23 G. Kuakumensah 23 1-2 1-2 1-2 0 2 2 1 0 0 1 1 4 NIAG Away Starter F 01/01/21
24 N. Kratholm 18 4-7 0-0 3-5 2 2 4 1 0 0 0 2 11 NIAG Away Starter F 01/01/21
25 M. Hammond 33 7-14 3-6 0-0 0 4 4 1 1 0 2 2 17 NIAG Away Starter G 01/01/21
26 J. Roberts 28 2-6 2-6 2-2 0 2 2 3 1 0 2 3 8 NIAG Away Starter G 01/01/21
27 J. Cintron 14 0-2 0-0 0-0 1 3 4 0 0 1 2 1 0 NIAG Away Bench F 01/01/21
28 DonaldN. MacDonald 9 0-1 0-1 0-0 0 3 3 0 0 0 0 0 0 NIAG Away Bench G 01/01/21
29 R. Solomon 25 4-11 0-2 2-2 1 3 4 0 3 0 0 1 10 NIAG Away Bench G 01/01/21
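A note on the player-name cleanup: the two regex steps in the code can be tried on their own with the standard re module. This is only a sketch. The raw cell format used here (short name repeated, followed by the position letter) is my assumption, not something taken from the actual HTML, and clean_player_cell is a made-up helper name.

```python
import re

SPLIT_PATTERN = re.compile(r"([a-z]+)([A-Z].+)")   # same pattern as the str.split step
NAME_POS_PATTERN = re.compile(r"^(.*?)([A-Z]+)$")  # same pattern as the str.extract step


def clean_player_cell(raw):
    """Return (player, position) using the same two steps as the pandas code."""
    parts = SPLIT_PATTERN.split(raw)
    if len(parts) < 3:
        # The pattern did not match, so there is nothing to split
        return raw, None
    match = NAME_POS_PATTERN.match(parts[2])
    if match is None:
        return parts[2], None
    return match.group(1), match.group(2)


# Assumed raw cell contents, for illustration only
print(clean_player_cell('H. DrameH. DrameF'))          # ('H. Drame', 'F')
print(clean_player_cell('N. MacDonaldN. MacDonaldG'))  # ('DonaldN. MacDonald', 'G')
```

The second call shows why row 28 of the output reads DonaldN. MacDonald: the first pattern splits at the first lowercase-to-uppercase boundary, and a surname like MacDonald has one in the middle. If that matters for your data, those names would need a different rule or a manual fix.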