Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to group pairs based on shared item in pd dataframe?

I have the table below. Items A, B, C, D, and E are actually the same type, but the "Old_Group" column splits them across M1, M2, and M3.

Is there a way to detect groups based on shared items? Here M1 and M3 both contain item "A", so even though their other items differ, they can be treated as the same type and should be grouped together. Since items G and H don't appear in any other group, they are assigned to a separate group. I would expect a result like the "New_Group" column.

The real table has many more items, so I'm wondering whether there is a fast way to group them correctly. The numbers in "New_Group" can be arbitrary, as long as each group gets a unique one.

Thank you very much in advance.

item  Old_Group  New_Group
A     M1         N1
B     M1         N1
C     M1         N1
A     M2         N1
B     M2         N1
A     M3         N1
D     M3         N1
E     M3         N1
G     M4         N2
H     M4         N2
question from:https://stackoverflow.com/questions/65930248/how-to-group-pairs-based-on-shared-item-in-pd-dataframe

1 Reply

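This is really a graph (network) problem: treat items and old groups as nodes, connect each item to its old group, and find the connected components. Try networkx.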

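import pandas as pd
import networkx as nx

# Build a graph whose nodes are items and old groups; each row becomes an edge.
G = nx.from_pandas_edgelist(df, 'item', 'Old_Group')
# One row per node, indexed by the number of its connected component.
l = pd.Series(list(nx.connected_components(G))).map(list).explode()
# Look up the component number of each row's old group.
df['New'] = df.Old_Group.map(dict(zip(l, l.index)))
df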
Out[75]: 
  item Old_Group New_Group  New
0    A        M1        N1    0
1    B        M1        N1    0
2    C        M1        N1    0
3    A        M2        N1    0
4    B        M2        N1    0
5    A        M3        N1    0
6    D        M3        N1    0
7    E        M3        N1    0
8    G        M4        N2    1
9    H        M4        N2    1
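
If networkx is not available, the same connected-components idea can be sketched with a small union-find (disjoint-set) structure in plain Python. This is only an illustration, not part of the answer above: the function name `group_by_shared_items` and the "N1", "N2" labelling scheme are assumptions made for the example, and the pairs are typed in by hand from the question's table.

```python
def group_by_shared_items(pairs):
    """Map each old group to a new group label via union-find.

    `pairs` is a list of (item, old_group) tuples (hypothetical helper).
    """
    parent = {}

    def find(x):
        # Walk up to the root, shortening the path along the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Tag the nodes so an item named "M1" cannot collide with a group "M1".
    for item, group in pairs:
        union(("item", item), ("group", group))

    labels = {}  # component root -> "N1", "N2", ... in order of first appearance
    result = {}  # old group -> new label
    for _, group in pairs:
        root = find(("group", group))
        labels.setdefault(root, f"N{len(labels) + 1}")
        result[group] = labels[root]
    return result


# The (item, Old_Group) rows from the question's table.
pairs = [("A", "M1"), ("B", "M1"), ("C", "M1"), ("A", "M2"), ("B", "M2"),
         ("A", "M3"), ("D", "M3"), ("E", "M3"), ("G", "M4"), ("H", "M4")]
mapping = group_by_shared_items(pairs)
print(mapping)
```

With a DataFrame, the pairs could be built as list(zip(df['item'], df['Old_Group'])) and the result attached with df['Old_Group'].map(mapping). Tagging each node as an item or a group is a deliberate choice here: it avoids merging two components by accident when an item and a group happen to share a name.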
