Use pivot, or set_index followed by unstack, and finish with add_prefix:
df = df.pivot(index='id', columns='mins', values='param').add_prefix('param_')
print(df)
mins  param_10  param_11  param_12
id
1         0.15      0.16      0.17
2         0.20      0.21      0.22
# Alternative, starting again from the original long-format df
df = df.set_index(['id', 'mins'])['param'].unstack().add_prefix('param_')
print(df)
mins  param_10  param_11  param_12
id
1         0.15      0.16      0.17
2         0.20      0.21      0.22
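For a self-contained check, here is a minimal sketch of both approaches side by side. The input frame is reconstructed from the printed output above, so its construction is an assumption, and the names wide_pivot and wide_unstack are only illustrative.

```python
import pandas as pd

# Long-format input, reconstructed from the printed result (an assumption).
df = pd.DataFrame({
    'id':    [1, 1, 1, 2, 2, 2],
    'mins':  [10, 11, 12, 10, 11, 12],
    'param': [0.15, 0.16, 0.17, 0.20, 0.21, 0.22],
})

# Approach 1: pivot, then prefix the new column labels.
wide_pivot = (df.pivot(index='id', columns='mins', values='param')
                .add_prefix('param_'))

# Approach 2: move id and mins into the index, then unstack the inner level.
wide_unstack = (df.set_index(['id', 'mins'])['param']
                  .unstack()
                  .add_prefix('param_'))

print(wide_pivot)
print(wide_pivot.equals(wide_unstack))
```

Keeping the results in separate variables, rather than overwriting df, lets both variants run from the same input.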
If you get the error:
ValueError: Index contains duplicate entries, cannot reshape
it means that some id/mins pairs occur more than once, so the values have to be aggregated. Use pivot_table, or groupby with an aggregate function such as mean (sum, median, ...) followed by unstack:
print(df)
   id  mins  param
0   1    10   0.15  <- duplicate (id, mins) pair
1   1    10   0.50  <- duplicate (id, mins) pair
2   1    11   0.16
3   1    12   0.17
4   2    10   0.20
5   2    11   0.21
6   2    12   0.22
df = (df.pivot_table(index='id', columns='mins', values='param', aggfunc='mean')
        .add_prefix('param_'))
print(df)
mins  param_10  param_11  param_12
id
1        0.325      0.16      0.17  <- (0.15 + 0.50) / 2 = 0.325
2        0.200      0.21      0.22
# Alternative, starting again from the original long-format df
df = df.groupby(['id', 'mins'])['param'].mean().unstack().add_prefix('param_')
print(df)
mins  param_10  param_11  param_12
id
1        0.325      0.16      0.17  <- (0.15 + 0.50) / 2 = 0.325
2        0.200      0.21      0.22
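The duplicate case can be reproduced end to end with the sketch below. It rebuilds the sample frame shown above, shows that plain pivot raises, and then checks that the two aggregating routes agree. The variable names are illustrative only.

```python
import pandas as pd

# Sample data with a duplicated (id, mins) pair: (1, 10) appears twice.
df = pd.DataFrame({
    'id':    [1, 1, 1, 1, 2, 2, 2],
    'mins':  [10, 10, 11, 12, 10, 11, 12],
    'param': [0.15, 0.50, 0.16, 0.17, 0.20, 0.21, 0.22],
})

# pivot has no rule for choosing between the two values, so it raises.
try:
    df.pivot(index='id', columns='mins', values='param')
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print(err)

# Aggregating resolves the duplicates; here the mean of each group is used.
via_pivot_table = (df.pivot_table(index='id', columns='mins',
                                  values='param', aggfunc='mean')
                     .add_prefix('param_'))
via_groupby = (df.groupby(['id', 'mins'])['param']
                 .mean()
                 .unstack()
                 .add_prefix('param_'))

print(via_pivot_table)
```

Which aggregate to use depends on what the duplicates mean in your data: mean suits repeated measurements, while sum suits partial counts.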
To clean up the result, set the columns name to None with rename_axis and use reset_index to turn the id index back into a column:
df = df.rename_axis(None, axis=1).reset_index()
print(df)
   id  param_10  param_11  param_12
0   1     0.325      0.16      0.17
1   2     0.200      0.21      0.22
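To see what the cleanup step changes, the following sketch inspects the columns name before and after. The small frame is built by hand to resemble the pivoted result, so its values and the names wide and clean are illustrative assumptions.

```python
import pandas as pd

# A small wide frame shaped like a pivot result (hand-built, illustrative).
wide = pd.DataFrame(
    {'param_10': [0.325, 0.200], 'param_11': [0.16, 0.21]},
    index=pd.Index([1, 2], name='id'),
)
wide.columns.name = 'mins'  # pivot leaves the old column name on this axis

# Drop the leftover axis name, then turn the id index into a regular column.
clean = wide.rename_axis(None, axis=1).reset_index()

print(wide.columns.name)    # axis name before the cleanup
print(clean.columns.name)   # axis name after the cleanup
print(list(clean.columns))
```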
EDIT:
Solution with multiple columns:
df['param1'] = df['param'] / 4
print(df)
   id  mins  param  param1
0   1    10   0.15  0.0375
1   1    10   0.50  0.1250
2   1    11   0.16  0.0400
3   1    12   0.17  0.0425
4   2    10   0.20  0.0500
5   2    11   0.21  0.0525
6   2    12   0.22  0.0550
df = df.pivot_table(index='id', columns='mins', values=['param', 'param1'], aggfunc='mean')
# the columns are now a MultiIndex of (value name, mins); join each pair into one label
df.columns = ['_'.join((x[0], str(x[1]))) for x in df.columns]
print(df)
    param_10  param_11  param_12  param1_10  param1_11  param1_12
id
1      0.325      0.16      0.17    0.08125     0.0400     0.0425
2      0.200      0.21      0.22    0.05000     0.0525     0.0550
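As a runnable sketch of the multi-column case, the snippet below rebuilds the sample data, prints the intermediate MultiIndex columns, and flattens them with an f-string instead of join. The f-string form is a swapped-in equivalent, not the original code, and the name wide is illustrative.

```python
import pandas as pd

# Sample data with a duplicated (id, mins) pair, plus a second value column.
df = pd.DataFrame({
    'id':    [1, 1, 1, 1, 2, 2, 2],
    'mins':  [10, 10, 11, 12, 10, 11, 12],
    'param': [0.15, 0.50, 0.16, 0.17, 0.20, 0.21, 0.22],
})
df['param1'] = df['param'] / 4

wide = df.pivot_table(index='id', columns='mins',
                      values=['param', 'param1'], aggfunc='mean')

# With several value columns the result has two column levels:
# each label is a (value name, mins) tuple.
print(wide.columns.tolist())

# Flatten each tuple into a single string label.
wide.columns = [f'{name}_{mins}' for name, mins in wide.columns]
wide = wide.reset_index()

print(wide)
```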