Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
537 views
in Technique[技术] by (71.8m points)

python - Can't drop NAN with dropna in pandas

I import pandas as pd and run the code below and get the following result

Code:

traindataset = pd.read_csv('/Users/train.csv')
print traindataset.dtypes
print traindataset.shape
print traindataset.iloc[25,3]
traindataset.dropna(how='any')
print traindataset.iloc[25,3]
print traindataset.shape

Output

TripType                   int64  
VisitNumber                int64  
Weekday                   object  
Upc                      float64  
ScanCount                  int64  
DepartmentDescription     object  
FinelineNumber           float64  
dtype: object

(647054, 7)

nan  
nan

(647054, 7) 
[Finished in 2.2s]

From the result, the dropna line doesn't work because the row number doesn't change and there is still NAN in the dataframe. How that comes? I am craaaazy right now.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You need to read the documentation (emphasis added):

Return object with labels on given axis omitted

dropna returns a new DataFrame. If you want it to modify the existing DataFrame, all you have to do is read further in the documentation:

inplace : boolean, default False

If True, do operation inplace and return None.

So to modify it in place, do traindataset.dropna(how='any', inplace=True).


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...