From a data frame with timestamped rows (strptime results), what is the best method for aggregating statistics for intervals?
Intervals could be an hour, a day, etc.
There's the aggregate function, but that doesn't help with assigning each row to an interval. I'm planning on adding a column to the data frame that denotes interval and using that with aggregate, but if there's a better solution it'd be great to hear it.
Thanks for any pointers!
Example Data
Five rows with timestamps divided into 15-minute intervals starting at 03:00.
Interval 1
- "2010-01-13 03:02:38 UTC"
- "2010-01-13 03:08:14 UTC"
- "2010-01-13 03:14:52 UTC"
Interval 2
- "2010-01-13 03:20:42 UTC"
- "2010-01-13 03:22:19 UTC"
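For reference, the plan described in the question (add an interval column, then pass it to aggregate) could be sketched roughly as follows. This is only an illustration under stated assumptions: the `value` column and its numbers are made up, the timestamps are assumed to be in UTC, and the interval is found by flooring epoch seconds to a multiple of 900 rather than by any method from the original post.

```r
# The five example timestamps, assumed to be UTC.
ts <- as.POSIXct(c("2010-01-13 03:02:38", "2010-01-13 03:08:14",
                   "2010-01-13 03:14:52", "2010-01-13 03:20:42",
                   "2010-01-13 03:22:19"), tz = "UTC")

# Hypothetical measurement column, invented purely for illustration.
df <- data.frame(time = ts, value = c(1, 2, 3, 4, 5))

# Floor each timestamp to the start of its 15-minute (900 s) interval.
period <- 900
start <- as.POSIXct(floor(as.numeric(df$time) / period) * period,
                    origin = "1970-01-01", tz = "UTC")

# Store the interval as a text label so it can be used as a grouping column.
df$interval <- format(start, "%Y-%m-%d %H:%M")

# Mean of the hypothetical value column per interval.
agg <- aggregate(value ~ interval, data = df, FUN = mean)
print(agg)
```

Changing `period` to 3600 or 86400 would give hourly or daily groups under the same assumptions.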
Conclusion
Using a time series package such as xts should be the solution; however, I had no success with one and wound up using cut. As I presently only need to plot histograms, with rows grouped by interval, this was enough.
cut is used like so:
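# Assign each timestamp in x to one of num.intervals bins, each `period` seconds long, beginning at `start`.
interv <- function(x, start, period, num.intervals) {
  breaks <- as.POSIXct(start) + (0:num.intervals) * period
  cut(x, breaks)
}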
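As a rough usage sketch, the helper can be applied to the five example timestamps. The helper is repeated here (same behavior as above) so the block runs on its own; the UTC time zone and the use of table() to get per-interval counts are assumptions made for this illustration.

```r
# interv() repeated from the post so this block is self-contained.
interv <- function(x, start, period, num.intervals) {
  cut(x, as.POSIXlt(start) + 0:num.intervals * period)
}

# The five example timestamps, assumed to be UTC.
ts <- as.POSIXct(c("2010-01-13 03:02:38", "2010-01-13 03:08:14",
                   "2010-01-13 03:14:52", "2010-01-13 03:20:42",
                   "2010-01-13 03:22:19"), tz = "UTC")

# Two 15-minute (900 s) intervals starting at 03:00 UTC.
start <- as.POSIXct("2010-01-13 03:00:00", tz = "UTC")
groups <- interv(ts, start, period = 900, num.intervals = 2)

# Rows per interval, i.e. the heights a histogram of the groups would show.
counts <- table(groups)
print(counts)
```

Timestamps that fall outside the range covered by the breakpoints come back as NA, so `num.intervals` has to be large enough to span the data.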