I'm posting this because the topic just got brought up in another question/answer and the behavior isn't very well documented.
Consider the dataframe df
import numpy as np
import pandas as pd

df = pd.DataFrame(dict(
    A=list('xxxyyy'),
    B=[np.nan, 1, 2, 3, 4, np.nan]
))
   A    B
0  x  NaN
1  x  1.0
2  x  2.0
3  y  3.0
4  y  4.0
5  y  NaN
I wanted to get the first and last rows of each group defined by column 'A'.
I tried
df.groupby('A').B.agg(['first', 'last'])
   first  last
A
x    1.0   2.0
y    3.0   4.0
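However, this doesn't give me the NaNs that I expected: 'first' and 'last' appear to skip null values and return the first and last non-null entries in each group.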
How do I get the actual first and last values in each group?
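To make the expected result concrete, here is a minimal sketch of one possible approach, not necessarily the canonical one. It assumes that positional indexing with .iloc inside agg is acceptable, and the output column names first and last are my own choice, passed as keyword arguments.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(dict(
    A=list('xxxyyy'),
    B=[np.nan, 1, 2, 3, 4, np.nan]
))

# Take the first and last element of each group by position,
# so NaN is returned when it actually sits at that position.
result = df.groupby('A').B.agg(
    first=lambda s: s.iloc[0],
    last=lambda s: s.iloc[-1],
)
print(result)
```

With the data above, group x should come back with NaN as its first value and 2.0 as its last, and group y with 3.0 and NaN. If whole rows are wanted rather than just column B, df.groupby('A').head(1) and df.groupby('A').tail(1) select rows by position as well.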