I have the following dataframe structure that is indexed with a timestamp:
neg neu norm pol pos
time
1520353341 0.000 1.000 0.0000 0.000000 0.000
1520353342 0.121 0.879 -0.2960 0.347851 0.000
1520353342 0.217 0.783 -0.6124 0.465833 0.000
I create a date from the timestamp:
data_frame['date'] = [datetime.datetime.fromtimestamp(d) for d in data_frame.index]
Result:
neg neu norm pol pos date
time
1520353341 0.000 1.000 0.0000 0.000000 0.000 2018-03-06 10:22:21
1520353342 0.121 0.879 -0.2960 0.347851 0.000 2018-03-06 10:22:22
1520353342 0.217 0.783 -0.6124 0.465833 0.000 2018-03-06 10:22:22
I want to group by hour and take the mean of every column. The index of each group should be the timestamp of the start of that hour. This is the result I want to achieve:
neg neu norm pol pos
time
1520352000 0.027989 0.893233 0.122535 0.221079 0.078779
1520355600 0.028861 0.899321 0.103698 0.209353 0.071811
The closest I have gotten so far has been with this answer:
data = data.groupby(data.date.dt.hour).mean()
Results:
neg neu norm pol pos
date
0 0.027989 0.893233 0.122535 0.221079 0.078779
1 0.028861 0.899321 0.103698 0.209353 0.071811
But I can't figure out how to keep a timestamp that reflects the hour where each group started.
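One possible approach, sketched under the assumption that the index holds integer epoch seconds as shown above: floor each timestamp to a multiple of 3600 and group on that value, so each group is labelled with the start of its hour. The toy values and the extra fourth row are made up for illustration. Unlike grouping on `dt.hour`, this keeps the same hour on different days in separate groups.

```python
import pandas as pd

# Toy data shaped like the frame in the question: epoch seconds as the index.
# The last row is invented so that a second hour appears in the result.
data_frame = pd.DataFrame(
    {
        "neg": [0.000, 0.121, 0.217, 0.100],
        "neu": [1.000, 0.879, 0.783, 0.900],
        "pos": [0.000, 0.000, 0.000, 0.000],
    },
    index=pd.Index([1520353341, 1520353342, 1520353342, 1520355700], name="time"),
)

# Floor every epoch-second timestamp to the start of its hour.
hour_start = (data_frame.index // 3600) * 3600

# Group on the floored timestamps and average each column.
hourly = data_frame.groupby(hour_start).mean()
hourly.index.name = "time"

print(hourly)
```

If the frame already has the `date` column, it is probably simplest to drop it before calling `mean()`. The flooring is done on the raw epoch value, so the hour boundaries are UTC hours, which matches the 1520352000 and 1520355600 labels in the desired output.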