Why it does not work
There is no datetime dtype to be set for read_csv as csv files can only contain strings, integers and floats.
Setting a dtype to datetime will make pandas interpret the datetime as an object, meaning you will end up with a string.
Pandas way of solving this
The pandas.read_csv()
function has a keyword argument called parse_dates
Using this you can on the fly convert strings, floats or integers into datetimes using the default date_parser
(dateutil.parser.parser
)
headers = ['col1', 'col2', 'col3', 'col4']
dtypes = {'col1': 'str', 'col2': 'str', 'col3': 'str', 'col4': 'float'}
parse_dates = ['col1', 'col2']
pd.read_csv(file, sep='', header=None, names=headers, dtype=dtypes, parse_dates=parse_dates)
This will cause pandas to read col1
and col2
as strings, which they most likely are ("2016-05-05" etc.) and after having read the string, the date_parser for each column will act upon that string and give back whatever that function returns.
Defining your own date parsing function:
The pandas.read_csv()
function also has a keyword argument called date_parser
Setting this to a lambda function will make that particular function be used for the parsing of the dates.
GOTCHA WARNING
You have to give it the function, not the execution of the function, thus this is Correct
date_parser = pd.datetools.to_datetime
This is incorrect:
date_parser = pd.datetools.to_datetime()
Pandas 0.22 Update
pd.datetools.to_datetime
has been relocated to date_parser = pd.to_datetime
Thanks @stackoverYC
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…