Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Calculate min and max (range) by group

I have something like this in a data frame:

PersonId Date_Withdrawal
       A      2012-05-01   
       A      2012-06-01
       B      2012-05-01
       C      2012-05-01
       A      2012-07-01
       A      2012-10-01
       B      2012-08-01
       B      2012-12-01
       C      2012-07-01

I'd like to obtain the min and max date for each PersonId.



1 Reply


First, convert the column to a proper date class (always good practice); then you can simply take the range by group. Here's an attempt with data.table:

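library(data.table)
# Convert df to a data.table by reference and parse the dates as IDate
setDT(df)[, Date_Withdrawal := as.IDate(Date_Withdrawal)]
# range() returns c(min, max); as.list() spreads them into two columns (V1, V2)
df[, as.list(range(Date_Withdrawal)), by = PersonId]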
#    PersonId         V1         V2
# 1:        A 2012-05-01 2012-10-01
# 2:        B 2012-05-01 2012-12-01
# 3:        C 2012-05-01 2012-07-01
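
The V1 and V2 column names above are data.table's defaults for unnamed results. If named columns are preferred, a minimal sketch (the names Min and Max and the rebuilt sample df are my own choices for illustration, not part of the original answer) could look like this:

```r
library(data.table)

# Rebuild the sample data from the question (assumed to be plain character columns)
df <- data.frame(
  PersonId = c("A", "A", "B", "C", "A", "A", "B", "B", "C"),
  Date_Withdrawal = c("2012-05-01", "2012-06-01", "2012-05-01",
                      "2012-05-01", "2012-07-01", "2012-10-01",
                      "2012-08-01", "2012-12-01", "2012-07-01"),
  stringsAsFactors = FALSE
)

# Convert by reference and parse the dates, as in the answer above
setDT(df)[, Date_Withdrawal := as.IDate(Date_Withdrawal)]

# Name the two summary columns explicitly instead of relying on V1/V2
res <- df[, .(Min = min(Date_Withdrawal), Max = max(Date_Withdrawal)),
          by = PersonId]
print(res)
```

The result should hold the same values as the range() version, only with descriptive column names.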

Or, with dplyr:

library(dplyr)
df %>%
  mutate(Date_Withdrawal = as.Date(Date_Withdrawal)) %>%
  group_by(PersonId) %>%
  summarise(Min = min(Date_Withdrawal), Max = max(Date_Withdrawal))
# Source: local data frame [3 x 3]
# 
#  PersonId        Min        Max
#    (fctr)     (date)     (date)
# 1        A 2012-05-01 2012-10-01
# 2        B 2012-05-01 2012-12-01
# 3        C 2012-05-01 2012-07-01

P.S. A base R version with aggregate would look like aggregate(as.Date(Date_Withdrawal) ~ PersonId, df, range), but it does not retain the Date class.
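
If a base R solution that keeps the Date class is needed, one possible workaround (a sketch of my own, not from the original answer) is to split the data frame by PersonId, summarise each piece, and bind the pieces back together. The sample df and the column names Min and Max are assumptions for illustration:

```r
# Rebuild the sample data from the question (assumed to be plain character columns)
df <- data.frame(
  PersonId = c("A", "A", "B", "C", "A", "A", "B", "B", "C"),
  Date_Withdrawal = c("2012-05-01", "2012-06-01", "2012-05-01",
                      "2012-05-01", "2012-07-01", "2012-10-01",
                      "2012-08-01", "2012-12-01", "2012-07-01"),
  stringsAsFactors = FALSE
)
df$Date_Withdrawal <- as.Date(df$Date_Withdrawal)

# One single-row data frame per person; min() and max() keep the Date class
pieces <- lapply(split(df, df$PersonId), function(g) {
  data.frame(
    PersonId = g$PersonId[1],
    Min = min(g$Date_Withdrawal),
    Max = max(g$Date_Withdrawal),
    stringsAsFactors = FALSE
  )
})

# rbind() on data frames preserves the Date columns
res <- do.call(rbind, pieces)
rownames(res) <- NULL
print(res)
```

This is more verbose than the aggregate one-liner, but the Min and Max columns stay as Date rather than being converted to numbers.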

