This should do it, need groupby()
twice:
df.groupby(['name', 'day']).sum()
.groupby(level=0).cumsum().reset_index()
Explanation:
print(df)
name day no
0 Jack Monday 10
1 Jack Tuesday 20
2 Jack Tuesday 10
3 Jack Wednesday 50
4 Jill Monday 40
5 Jill Wednesday 110
# sum per name/day
print( df.groupby(['name', 'day']).sum() )
no
name day
Jack Monday 10
Tuesday 30
Wednesday 50
Jill Monday 40
Wednesday 110
# cumulative sum per name/day
print( df.groupby(['name', 'day']).sum()
.groupby(level=0).cumsum() )
no
name day
Jack Monday 10
Tuesday 40
Wednesday 90
Jill Monday 40
Wednesday 150
The dataframe resulting from the first sum is indexed by 'name'
and by 'day'
. You can see it by printing
df.groupby(['name', 'day']).sum().index
When computing the cumulative sum, you want to do so by 'name'
, corresponding to the first index (level 0).
Finally, use reset_index
to have the names repeated.
df.groupby(['name', 'day']).sum().groupby(level=0).cumsum().reset_index()
name day no
0 Jack Monday 10
1 Jack Tuesday 40
2 Jack Wednesday 90
3 Jill Monday 40
4 Jill Wednesday 150
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…