Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Cumulative Sum Function on Pandas Data Frame

I am attempting to capture a "running" cumulative sum given a series of period amounts.

See example:

[image: example data frame]

df = df[1:4].cumsum() # this doesn't return the desired result


1 Reply


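You're looking for the axis parameter. Many pandas functions take this argument to choose the direction in which an operation is applied. With axis=0 (the default), cumsum accumulates down the rows, so each column gets its own running total from top to bottom. With axis=1, it accumulates across the columns, so each row gets its own running total from left to right. Your periods are laid out as columns, so you want axis=1.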

df.cumsum(axis=1) by itself works on your example to produce the output table.

In [3]: df.cumsum(axis=1)
Out[3]:
      1   2   3   4
10   16  30  41  61
51   13  29  40  50
13   11  30  45  61
321  12  27  37  52
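
For anyone who wants to run this, here is a minimal sketch. The original frame is not shown in the post, so the period amounts below are an assumption: they are reconstructed by differencing the running totals printed above. The variable names across_columns and down_rows are illustrative only.

```python
import pandas as pd

# Period amounts reconstructed from the running totals printed above
# (an assumption: the asker's original frame is not shown in the post).
df = pd.DataFrame(
    [[16, 14, 11, 20],
     [13, 16, 11, 10],
     [11, 19, 15, 16],
     [12, 15, 10, 15]],
    index=[10, 51, 13, 321],
    columns=["1", "2", "3", "4"],
)

# axis=1: accumulate left to right across the columns of each row.
across_columns = df.cumsum(axis=1)

# axis=0 (the default): accumulate top to bottom down the rows of each column.
down_rows = df.cumsum(axis=0)

print(across_columns)
print(down_rows)
```

Printing both results side by side makes the difference visible: the first matches the table above, while the second sums each column downward and is not what the question asks for.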

I suspect you're interested in restricting the running total to a specific range of columns, though. To do that, select the columns with .loc using their labels (the column labels in my frame are strings).

In [4]: df.loc[:, '2':'3'].cumsum(axis=1)
Out[4]:
      2   3
10   14  25
51   16  27
13   19  34
321  15  25

.loc is label-based and includes both endpoints of the slice. If you want to find out more about indexing in pandas, check the docs.
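
To make the inclusive-bounds point concrete, here is a small sketch comparing .loc with the position-based .iloc, whose slices exclude the stop position. The frame is the same reconstructed one (an assumption, since the original data is not shown), and by_label, by_position, and result are hypothetical names. The last step, writing the totals back into a copy, is one possible way to keep the untouched columns alongside the running totals, not something the answer prescribes.

```python
import pandas as pd

# Same reconstructed period amounts (an assumption, not the asker's data).
df = pd.DataFrame(
    [[16, 14, 11, 20],
     [13, 16, 11, 10],
     [11, 19, 15, 16],
     [12, 15, 10, 15]],
    index=[10, 51, 13, 321],
    columns=["1", "2", "3", "4"],
)

# .loc slices by label and includes both endpoints: columns "2" and "3".
by_label = df.loc[:, "2":"3"].cumsum(axis=1)

# .iloc slices by position and excludes the stop: positions 1 and 2,
# which happen to be the same two columns in this frame.
by_position = df.iloc[:, 1:3].cumsum(axis=1)

# Optionally write the running totals back into a copy,
# leaving columns "1" and "4" unchanged.
result = df.copy()
result.loc[:, "2":"3"] = by_label

print(by_label)
print(result)
```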

