I read a csv file containing 150,000 lines into a pandas dataframe. This dataframe has a field, Date
, with the dates in yyyy-mm-dd
format. I want to extract the month, day and year from it and copy into the dataframes' columns, Month
, Day
and Year
respectively. For a few hundred records the below two methods work ok, but for 150,000 records both take a ridiculously long time to execute. Is there a faster way to do this for 100,000+ records?
First method:
df = pandas.read_csv(filename)
for i in xrange(len(df)):
df.loc[i,'Day'] = int(df.loc[i,'Date'].split('-')[2])
Second method:
df = pandas.read_csv(filename)
for i in xrange(len(df)):
df.loc[i,'Day'] = datetime.strptime(df.loc[i,'Date'], '%Y-%m-%d').day
Thank you.
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…