Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - pandas divide row value by aggregated sum with a condition set by other cell

Hi, hoping to get some help. I have a two-column DataFrame df:

Source ID
1      2
2      3
1      2
1      2
1      3
3      1

My intention is to group by Source, divide each ID value by the total of ID within its Source group, and attach the result to the original DataFrame as a new column, so it would look like this:

   Source ID  ID_new
    1      2  2/9
    2      3  3/3
    1      2  2/9
    1      2  2/9
    1      3  3/9
    3      1  1/1

I've gotten as far as:

df.groupby('Source')['ID'].sum()

which gives the total of ID per Source, but I'm not sure where to go next.


1 Reply


Try this, using transform('sum') so that each row gets the total of its own Source group:

In [79]: df.assign(ID_new=df.ID/df.groupby('Source').ID.transform('sum'))
Out[79]:
   Source  ID    ID_new
0       1   2  0.222222
1       2   3  1.000000
2       1   2  0.222222
3       1   2  0.222222
4       1   3  0.333333
5       3   1  1.000000
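
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. It rebuilds the sample data from the question and compares sum() with transform('sum'); the variable names group_totals, row_totals and result are only illustrative and are not from the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Source": [1, 2, 1, 1, 1, 3],
                   "ID":     [2, 3, 2, 2, 3, 1]})

# sum() collapses the data to one value per Source group (3 values here) ...
group_totals = df.groupby("Source")["ID"].sum()

# ... while transform("sum") returns a Series with the same index as df,
# repeating each group's total on every row that belongs to that group.
row_totals = df.groupby("Source")["ID"].transform("sum")

# Because row_totals is aligned with df, a plain division works row by row.
result = df.assign(ID_new=df["ID"] / row_totals)
print(result)
```

The point is that transform keeps the original shape, which is why no merge or join back onto df is needed.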

If you need it as a new persistent column, you can assign it directly, as @jezrael proposed in the comments:

In [81]: df['ID_new'] = df.ID/df.groupby('Source').ID.transform('sum')

In [82]: df
Out[82]:
   Source  ID    ID_new
0       1   2  0.222222
1       2   3  1.000000
2       1   2  0.222222
3       1   2  0.222222
4       1   3  0.333333
5       3   1  1.000000
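
The question shows the new column written as fractions such as 2/9 rather than decimals. If that text form is really what is wanted, one possible approach (a sketch only; the column name ID_frac is made up for illustration) is to build the labels as strings from the same transform result:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Source": [1, 2, 1, 1, 1, 3],
                   "ID":     [2, 3, 2, 2, 3, 1]})

# Per-row group totals, aligned with df's index.
totals = df.groupby("Source")["ID"].transform("sum")

# Numeric ratio, as in the answer above.
df["ID_new"] = df["ID"] / totals

# Hypothetical extra column: the unreduced "ID/total" label as text.
df["ID_frac"] = df["ID"].astype(str) + "/" + totals.astype(str)
print(df)
```

Note that ID_frac holds strings, so it is only useful for display; keep ID_new for any further arithmetic.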


...