Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
381 views
in Technique[技术] by (71.8m points)

file - Is it possible to read categorical columns with pandas' read_csv?

I have tried passing the dtype parameter with read_csv as dtype={n: pandas.Categorical} but this does not work properly (the result is an Object). The manual is unclear.

question from:https://stackoverflow.com/questions/30272300/is-it-possible-to-read-categorical-columns-with-pandas-read-csv

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

In version 0.19.0 you can use parameter dtype='category' in read_csv:

data = 'col1,col2,col3
a,b,1
a,b,2
c,d,3'
df = pd.read_csv(pd.compat.StringIO(data), dtype='category')
print (df)
  col1 col2 col3
0    a    b    1
1    a    b    2
2    c    d    3

print (df.dtypes)
col1    category
col2    category
col3    category
dtype: object

If want specify column for category use dtype with dictionary:

df = pd.read_csv(pd.compat.StringIO(data), dtype={'col1':'category'})
print (df)
  col1 col2  col3
0    a    b     1
1    a    b     2
2    c    d     3

print (df.dtypes)
col1    category
col2      object
col3       int64
dtype: object

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...