You can use pivot
for reshaping - you get NaN
in missing values by column time
, then unstack
with reset_index
and sort_values
:
import pandas as pd
df = pd.DataFrame({'date': {0: 20100201, 1: 20100201, 2: 20100201, 3: 20100201, 4: 20100202, 5: 20100202, 6: 20100202, 7: 20100202, 8: 20100203, 9: 20100203, 10: 20100204},
'time': {0: 0, 1: 6, 2: 12, 3: 18, 4: 0, 5: 6, 6: 12, 7: 18, 8: 0, 9: 18, 10: 6},
'value': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 11, 10: 12}})
print (df)
date time value
0 20100201 0 1
1 20100201 6 2
2 20100201 12 3
3 20100201 18 4
4 20100202 0 5
5 20100202 6 6
6 20100202 12 7
7 20100202 18 8
8 20100203 0 9
9 20100203 18 11
10 20100204 6 12
print (df.pivot(index='date', columns='time', values='value')
.unstack()
.reset_index(name='value')
.sort_values('date'))
time date value
0 0 20100201 1.0
4 6 20100201 2.0
8 12 20100201 3.0
12 18 20100201 4.0
1 0 20100202 5.0
5 6 20100202 6.0
9 12 20100202 7.0
13 18 20100202 8.0
2 0 20100203 9.0
6 6 20100203 NaN
10 12 20100203 NaN
14 18 20100203 11.0
3 0 20100204 NaN
7 6 20100204 12.0
11 12 20100204 NaN
15 18 20100204 NaN
Maybe you can reset_index
again, if you need nice index
like:
print (df.pivot(index='date', columns='time', values='value')
.unstack()
.reset_index(name='value')
.sort_values('date')
.reset_index(drop=True))
time date value
0 0 20100201 1.0
1 6 20100201 2.0
2 12 20100201 3.0
3 18 20100201 4.0
4 0 20100202 5.0
5 6 20100202 6.0
6 12 20100202 7.0
7 18 20100202 8.0
8 0 20100203 9.0
9 6 20100203 NaN
10 12 20100203 NaN
11 18 20100203 11.0
12 0 20100204 NaN
13 6 20100204 12.0
14 12 20100204 NaN
15 18 20100204 NaN