python - Parsing "NA" entries as NaN values when reading in a pandas dataframe

Question

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

I am new to pandas. I loaded a CSV file using pandas.read_csv. At first I did not specify dtype, but reading was far too slow, and since the file is very large I now specify the data types explicitly. However, some numeric columns contain the string "NA". I have passed na_values=['NA']; will that affect my data frame? I still want to keep these rows. My question is: if I specify dtype and add na_values=['NA'], will the rows containing NA be dropped? If so, how can I keep a similar processing time without losing them? Thank you very much!

1 Reply

深蓝 · Answer 1 · 2021-10-23T18:25:58+0000

From the pd.read_csv docs:

na_values : scalar, str, list-like, or dict, default None

Additional strings to recognize as NA/NaN. If dict passed, specific per-column NA values. By default the following values are interpreted as NaN: '', ... 'NA', ....

Note that 'NA' is in the default list. These values are not dropped; they are converted to NaN, and the rows that contain them stay in the data frame. Because 'NA' is already recognised by default, passing na_values=['NA'] is harmless but redundant.
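To illustrate, here is a minimal sketch with explicit dtypes. The sample data and the column names (id, score, count) are made up for this example and are not from the question. One caveat worth hedging: a plain NumPy integer dtype such as int64 cannot hold NaN, so to my knowledge read_csv raises an error if such a column contains "NA". The sketch assumes a reasonably recent pandas version and uses float64 and the nullable "Int64" dtype for the columns that have missing entries.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the large CSV from the question.
csv_text = "id,score,count\n1,3.5,10\n2,NA,NA\n3,4.0,7\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    # "id" has no missing values, so a plain int64 is fine.
    # "score" is float64, which can store NaN.
    # "count" uses the nullable integer dtype "Int64" (capital I).
    dtype={"id": "int64", "score": "float64", "count": "Int64"},
    na_values=["NA"],  # redundant: "NA" is already a default NA string
)

print(len(df))                   # all three rows are kept
print(df["score"].isna().sum())  # one missing value in the float column
print(df["count"].isna().sum())  # one missing value in the nullable integer column
```

Specifying dtype this way keeps the speed benefit of skipping type inference, and the rows with missing entries are still present in the result.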