Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
232 views
in Technique[技术] by (71.8m points)

python - How to treat NULL as a normal string with pandas?

I have a csv-file with a column with strings and I want to read it with pandas. In this file the string null occurs as an actual value and should not be regarded as a missing value.

Example:

import pandas as pd
from io import StringIO

data = u'strings,numbers
foo,1
bar,2
null,3'
print(pd.read_csv(StringIO(data)))

This gives the following output:

  strings  numbers
0     foo        1
1     bar        2
2     NaN        3

What can I do to get the value null as it is (and not as NaN) into the DataFrame? The file can be assumed to not contain any actually missing values.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You can specify a converters argument for the string column.

pd.read_csv(StringIO(data), converters={'strings' : str})

  strings  numbers
0     foo        1
1     bar        2
2    null        3

This will by-pass pandas' automatic parsing.


Another option is setting na_filter=False:

pd.read_csv(StringIO(data), na_filter=False)

  strings  numbers
0     foo        1
1     bar        2
2    null        3

This works for the entire DataFrame, so use with caution. I recommend first option if you want to surgically apply this to select columns instead.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...