python - Modifying a subset of rows in a pandas dataframe

Question

posted Oct 17, 2021 in Technique by 深蓝 (71.8m points)

Assume I have a pandas DataFrame with two columns, A and B. I'd like to modify this DataFrame (or create a copy) so that B is always NaN whenever A is 0. How would I achieve that?

I tried the following

df['A'==0]['B'] = np.nan

and

df['A'==0]['B'].values.fill(np.nan)

without success.

1 Reply

深蓝 · Answer 1 · 2021-10-16T22:33:17+0000

Use .loc for label-based indexing, selecting the rows and the column in a single operation:

df.loc[df.A==0, 'B'] = np.nan

The df.A==0 expression creates a boolean series that indexes the rows, 'B' selects the column. You can also use this to transform a subset of a column, e.g.:

df.loc[df.A==0, 'B'] = df.loc[df.A==0, 'B'] / 2
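To make the two .loc forms above concrete, here is a minimal, self-contained sketch. The sample values are made up for illustration; only the column names A and B come from the question. Since the question also allows creating a copy, the sketch includes one non-mutating variant built on assign and Series.mask, which is my addition rather than part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; B is float so it can hold NaN without a dtype change.
df = pd.DataFrame({"A": [0, 1, 0, 2], "B": [10.0, 20.0, 30.0, 40.0]})

# Transform a subset: halve B only in the rows where A is 0 (done on a copy here).
halved = df.copy()
mask = halved["A"] == 0
halved.loc[mask, "B"] = halved.loc[mask, "B"] / 2

# Non-mutating variant: build a new frame and leave df as it is.
# Series.mask replaces values with NaN wherever the condition is True.
result = df.assign(B=df["B"].mask(df["A"] == 0))

# In-place variant from the answer: set B to NaN wherever A is 0.
df.loc[df["A"] == 0, "B"] = np.nan

print(halved)
print(result)
print(df)
```

Both the row mask and the column label go inside the same pair of brackets, so pandas performs one indexing operation on the original frame.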

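I don't know enough about pandas internals to say exactly why this works where the chained form does not, but the basic issue is that indexing into a DataFrame sometimes returns a copy of the data and sometimes returns a view on the original object. According to the pandas documentation, this behavior depends on the underlying numpy behavior. With chained indexing ([one][two]), the assignment can land on a temporary copy and be silently lost, so I've found that accessing everything in one operation is the reliable way to set values. Note also that in the attempts from the question, 'A'==0 is a plain Python comparison between a string and an integer, which evaluates to False, so df['A'==0] looks up a column named False instead of building a row mask. The mask has to be built from the column itself, as df.A==0 or df['A']==0.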
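The copy-versus-view issue can be seen in a small experiment. This is a sketch with made-up sample data, assuming a reasonably recent pandas; it compares the chained form (with the row mask written correctly) against the single .loc call. Warnings are silenced only to keep the output quiet, because pandas usually warns about the chained pattern.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names A and B come from the question.
df = pd.DataFrame({"A": [0, 1, 0, 2], "B": [10.0, 20.0, 30.0, 40.0]})

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    # Chained indexing: df[mask] produces a new object first, and the
    # assignment is applied to that temporary object, not to df.
    df[df["A"] == 0]["B"] = np.nan

nan_count_after_chained = int(df["B"].isna().sum())

# Single indexing operation: rows and column are selected together on df itself.
df.loc[df["A"] == 0, "B"] = np.nan
nan_count_after_loc = int(df["B"].isna().sum())

print(nan_count_after_chained, nan_count_after_loc)
```

With a boolean row mask the chained assignment leaves df untouched, while the .loc assignment changes the two rows where A is 0.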
