Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Pandas multiindex sum on one slice, set others to zero

I have a Series with a three-level MultiIndex, built like this:

import pandas as pd

start_date = '2020-01-01'
end_date = '2020-03-01'
date_range = pd.date_range(start_date, end_date, freq='MS')  # month starts
axis_1, axis_2 = ['A', 'B'], ['red', 'blue']
iterables = [date_range, axis_1, axis_2]
index_names = ['time', 'level 1', 'level 2']
multi_index = pd.MultiIndex.from_product(iterables, names=index_names)
_data = [0.5] * 6 + [0.2] * 6  # sample values matching the printout below
df = pd.Series(_data, index=multi_index)  # note: a Series, despite the name df

which looks like this:

time        level 1  level 2
2020-01-01  A        red        0.5
                     blue       0.5
            B        red        0.5
                     blue       0.5
2020-02-01  A        red        0.5
                     blue       0.5
            B        red        0.2
                     blue       0.2
2020-03-01  A        red        0.2
                     blue       0.2
            B        red        0.2
                     blue       0.2
dtype: float64

I want to sum within each group defined by levels [0, 1], in other words red + blue for each (time, level 1) pair. However, I want to assign the sum to red and set blue to zero.

Like this:

time        level 1  level 2
2020-01-01  A        red        1.0
                     blue       0.0
            B        red        1.0
                     blue       0.0
2020-02-01  A        red        1.0
                     blue       0.0
            B        red        0.4
                     blue       0.0
2020-03-01  A        red        0.4
                     blue       0.0
            B        red        0.4
                     blue       0.0
dtype: float64

Help is much appreciated.

In my actual data, the number of distinct values in every level is large, so I don't want to refer to them explicitly, except, of course, for the label 'red' that I want to assign to.

Question from: https://stackoverflow.com/questions/65909367/pandas-multiindex-sum-on-one-slice-set-others-to-zero

1 Reply

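If I understand correctly, you can try something like this:

# Sum over levels 0 and 1; Series.sum(level=...) was removed in pandas 2.0, so use groupby.
df_sum = df.groupby(level=[0, 1]).sum().to_frame()
# Tag every sum as 'red' so it can rejoin the original three-level index.
df_sum['level 2'] = 'red'
# Reindex to the full index; rows with no sum (blue) are filled with 0.
df_sum.set_index('level 2', append=True).reindex(df.index, fill_value=0)[0]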

Output:

time        level 1  level 2
2020-01-01  A        red        1.0
                     blue       0.0
            B        red        1.0
                     blue       0.0
2020-02-01  A        red        1.0
                     blue       0.0
            B        red        0.4
                     blue       0.0
2020-03-01  A        red        0.4
                     blue       0.0
            B        red        0.4
                     blue       0.0
Name: 0, dtype: float64
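
A possible alternative, not part of the answer above: broadcast each group's sum back onto the original index with `groupby(...).transform('sum')`, then keep it only where level 2 is 'red'. This is a minimal sketch that assumes the sample data from the question; the names `s`, `group_sum`, `is_red` and `result` are made up for illustration.

```python
import pandas as pd

# Rebuild the sample Series from the question (values taken from its printout).
date_range = pd.date_range('2020-01-01', '2020-03-01', freq='MS')
multi_index = pd.MultiIndex.from_product(
    [date_range, ['A', 'B'], ['red', 'blue']],
    names=['time', 'level 1', 'level 2'],
)
s = pd.Series([0.5] * 6 + [0.2] * 6, index=multi_index)

# Sum within each (time, level 1) group, repeated on every row of that group.
group_sum = s.groupby(level=['time', 'level 1']).transform('sum')

# Boolean mask: True on the rows whose 'level 2' label is 'red'.
is_red = s.index.get_level_values('level 2') == 'red'

# Keep the group sum on the red rows and write 0.0 everywhere else.
result = group_sum.where(is_red, 0.0)
print(result)
```

Only the label 'red' is named explicitly, so this should carry over to indexes with many values per level. Unlike the reindex approach, the result keeps the original Series name (None here) rather than being named 0.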
