Let's break down the code below.
- First, group your dataframe by col1, then call .agg on the grouped object.
- Use a lambda function on col2 to collect all of its elements into a list.
- Pass 'first' for col3 and col4 to keep only the first element of each within every group.
- Then reset the index so that col1 becomes a regular column again.
agg_df = (df.groupby('col1')
            .agg({'col2': lambda x: x.tolist(), 'col3': 'first', 'col4': 'first'})
            .reset_index())
print(agg_df)
  col1              col2  col3    col4
0  ID1          [DE, DZ]    69   min-8
1  ID3      [DA, AC, RC]    54  min-15
2  ID7              [XC]     4   min-7
3  ID8  [UC, TC, VC, WC]     2  min-40
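The input dataframe is not shown here, so the following is only a sketch of the same aggregation on made-up data. The rows of df are hypothetical, chosen to be consistent with the printed result above; only the groupby/agg/reset_index chain comes from the answer.

```python
import pandas as pd

# Hypothetical input: the original df is not shown, so these rows are
# invented to be consistent with the printed result.
df = pd.DataFrame({
    "col1": ["ID1", "ID1", "ID3", "ID3", "ID3", "ID7", "ID8", "ID8", "ID8", "ID8"],
    "col2": ["DE", "DZ", "DA", "AC", "RC", "XC", "UC", "TC", "VC", "WC"],
    "col3": [69, 69, 54, 54, 54, 4, 2, 2, 2, 2],
    "col4": ["min-8", "min-8", "min-15", "min-15", "min-15",
             "min-7", "min-40", "min-40", "min-40", "min-40"],
})

agg_df = (
    df.groupby("col1")
      .agg({
          "col2": lambda x: x.tolist(),  # collect every col2 value of the group
          "col3": "first",               # keep the first col3 value per group
          "col4": "first",               # keep the first col4 value per group
      })
      .reset_index()                     # turn col1 back into a regular column
)
print(agg_df)
```

Each key in the dictionary passed to .agg names a column, and each value says how to reduce that column within a group, which is why col2 can become a list while col3 and col4 collapse to a single value.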
To then convert col2 from a list into a proper string, we can join its elements with a comma:
agg_df['col2'].apply(lambda x: ','.join(str(i) for i in x))
Out[16]:
0          DE,DZ
1       DA,AC,RC
2             XC
3    UC,TC,VC,WC
Name: col2, dtype: object
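Note that .apply returns a new Series and leaves agg_df unchanged, so the result has to be assigned back if you want to keep it. A minimal sketch, again on hypothetical sample data, showing the assignment and, as an alternative, doing the join directly inside .agg:

```python
import pandas as pd

# Hypothetical sample data, not the original df.
df = pd.DataFrame({
    "col1": ["ID1", "ID1", "ID7"],
    "col2": ["DE", "DZ", "XC"],
    "col3": [69, 69, 4],
    "col4": ["min-8", "min-8", "min-7"],
})

# Option 1: build lists first, then overwrite col2 with the joined strings.
agg_df = (
    df.groupby("col1")
      .agg({"col2": lambda x: x.tolist(), "col3": "first", "col4": "first"})
      .reset_index()
)
agg_df["col2"] = agg_df["col2"].apply(lambda x: ",".join(str(i) for i in x))

# Option 2: join inside the aggregation and skip the intermediate lists.
joined_df = (
    df.groupby("col1")
      .agg({"col2": lambda x: ",".join(map(str, x)), "col3": "first", "col4": "first"})
      .reset_index()
)
print(joined_df)
```

Both options should give the same col2 strings; the second is shorter if the lists are never needed on their own.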