Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
422 views
in Technique[技术] by (71.8m points)

python - Row-wise average for a subset of columns with missing values

I've got a 'DataFrame` which has occasional missing values, and looks something like this:

          Monday         Tuesday         Wednesday 
      ================================================
Mike        42             NaN               12
Jenna       NaN            NaN               15
Jon         21              4                 1

I'd like to add a new column to my data frame where I'd calculate the average across all columns for every row.

Meaning, for Mike, I'd need (df['Monday'] + df['Wednesday'])/2, but for Jenna, I'd simply use df['Wednesday amt.']/1

Does anyone know the best way to account for this variation that results from missing values and calculate the average?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You can simply:

df['avg'] = df.mean(axis=1)

       Monday  Tuesday  Wednesday        avg
Mike       42      NaN         12  27.000000
Jenna     NaN      NaN         15  15.000000
Jon        21        4          1   8.666667

because .mean() ignores missing values by default: see docs.

To select a subset, you can:

df['avg'] = df[['Monday', 'Tuesday']].mean(axis=1)

       Monday  Tuesday  Wednesday   avg
Mike       42      NaN         12  42.0
Jenna     NaN      NaN         15   NaN
Jon        21        4          1  12.5

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...