python - Is there an "ungroup by" operation opposite to .groupby in pandas?

Question

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

Suppose we take a pandas dataframe...

    name  age  family
0   john    1       1
1  jason   36       1
2   jane   32       1
3   jack   26       2
4  james   30       2

Then do a groupby() ...

group_df = df.groupby('family')
group_df = group_df.aggregate({'name': name_join, 'age': pd.np.mean})

Then do some aggregate/summarize operation (in my example, my function name_join aggregates the names):

def name_join(list_names, concat='-'):
    return concat.join(list_names)

The grouped summarized output is thus:

        age             name

family                      
1        23  john-jason-jane
2        28       jack-james

Question:

Is there a quick, efficient way to get to the following from the aggregated table?

    name  age  family
0   john   23       1
1  jason   23       1
2   jane   23       1
3   jack   28       2
4  james   28       2

(Note: the age column values are just examples, I don't care for the information I am losing after averaging in this specific example)

The way I thought I could do it does not look too efficient:

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

深蓝 · Answer 1 · 2021-10-17T02:51:15+0000

The rough equivalent is .reset_index(), but it may not be helpful to think of it as the "opposite" of groupby().

You are splitting a string in to pieces, and maintaining each piece's association with 'family'. This old answer of mine does the job.

Just set 'family' as the index column first, refer to the link above, and then reset_index() at the end to get your desired result.