Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - How to get last data for each id/date?

I have a data frame that contains an ID, a POSIXct date-time, and a value:

> myData

   Tpt_ID    Tpt_DateTime               Value
1  1         2013-01-01 15:17:21 CST    10
2  2         2013-01-01 15:18:32 CST    5
3  3         2013-01-01 16:00:02 CST    1
4  1         2013-01-02 15:10:11 CST    15
5  2         2013-02-02 11:18:32 CST    6
6  3         2013-02-03 12:00:02 CST    2
7  1         2013-01-01 19:17:21 CST    21
8  2         2013-02-02 20:18:32 CST    8
9  3         2013-02-03 22:00:02 CST    3

I'd like to get the last Value for each date and ID.

For example,

Tpt_ID   Tpt_DateTime               Value
2        2013-01-01 15:18:32 CST    5
3        2013-01-01 16:00:02 CST    1
1        2013-01-02 15:10:11 CST    15
1        2013-01-01 19:17:21 CST    21
2        2013-02-02 20:18:32 CST    8
3        2013-02-03 22:00:02 CST    3

Data sample:

structure(list(Tpt_ID = c(1, 2, 3, 1, 2, 3, 1, 2, 3), Tpt_DateTime = structure(c(1357024641, 1357024712, 1357027202, 1357110611, 1359775112, 1359864002, 1357039041, 1359807512, 1359900002), class = c("POSIXct", "POSIXt"), tzone = ""), Value = c(10, 5, 1, 15, 6, 2, 21, 8, 3)), .Names = c("Tpt_ID", "Tpt_DateTime", "Value"), row.names = c(NA, 9L), class = "data.frame")

1 Reply


You can do this pretty easily using data.table syntax...

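#  Load package
library( data.table )

#  Turn the question's 'data.frame' (myData) into a 'data.table'
dt <- data.table( myData )

#  Make dates from date/time
#  (as.Date() on POSIXct uses UTC unless a 'tz' argument is given)
dt[ , Date:= as.Date( Tpt_DateTime ) ]

#  Get last row of each group
#  (assumes rows are already in time order within each group)
dt[ , .SD[.N] ,  by = c("Tpt_ID" , "Date") ]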
#   Tpt_ID       Date        Tpt_DateTime Value
#1:      1 2013-01-01 2013-01-01 11:17:21    21
#2:      2 2013-01-01 2013-01-01 07:18:32     5
#3:      3 2013-01-01 2013-01-01 08:00:02     1
#4:      1 2013-01-02 2013-01-02 07:10:11    15
#5:      2 2013-02-02 2013-02-02 12:18:32     8
#6:      3 2013-02-03 2013-02-03 14:00:02     3
  • First we turn your date-time data into a date with Date := as.Date( Tpt_DateTime )

  • Then we use .SD, which holds the subset of the data for each group. .N contains the number of rows in each group, so .SD[.N] gives us the last row of each group.

  • Lastly, by=c("Tpt_ID" , "Date") defines the groups.
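
Note that .SD[.N] returns the last row in the table's current order, not necessarily the latest timestamp. If the rows might not already be sorted by time, a variant along these lines may be safer. This is only a sketch: the sample data is rebuilt with an explicit "UTC" time zone (an assumption, so the result does not depend on the local machine), and last_rows is just an illustrative name.

```r
library(data.table)

# Sample data from the question. tz = "UTC" is an assumption made here
# so that the derived dates are the same on every machine.
myData <- data.frame(
  Tpt_ID = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  Tpt_DateTime = as.POSIXct(
    c(1357024641, 1357024712, 1357027202, 1357110611, 1359775112,
      1359864002, 1357039041, 1359807512, 1359900002),
    origin = "1970-01-01", tz = "UTC"
  ),
  Value = c(10, 5, 1, 15, 6, 2, 21, 8, 3)
)

dt <- as.data.table(myData)

# Derive the calendar date in an explicit time zone
dt[, Date := as.Date(Tpt_DateTime, tz = "UTC")]

# Sort by timestamp first, then keep the last row of each (ID, date) group
last_rows <- dt[order(Tpt_DateTime), .SD[.N], by = .(Tpt_ID, Date)]
print(last_rows)
```

With the sample data this keeps six rows, one per ID and date, with Values 21, 5, 1, 15, 8 and 3. If your dates should follow local time rather than UTC, pass the appropriate tz to as.Date() instead.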

