I actually do this quite a bit; the best method I have found is using data.table. Let's suppose your data is saved as ds, then:
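library(data.table)

# Sample data: one row per visit
s <- read.table(text = "Visit Patient Admission Discharge
1 1 2015/01/01 2015/01/02
2 2 2015/01/01 2015/01/01
3 3 2015/01/01 2015/01/02
4 1 2015/01/09 2015/01/09
5 2 2015/04/01 2015/04/05
6 1 2015/05/01 2015/05/01", header = TRUE, sep = "")
s$Admission <- as.POSIXct(s$Admission, format = "%Y/%m/%d")
s$Discharge <- as.POSIXct(s$Discharge, format = "%Y/%m/%d")
ds <- data.table(s)
# Sort by patient, then admission, so the "lead" row is that patient's next visit
setkey(ds, Patient, Admission)
# Daydiff = days from this discharge to the same patient's next admission.
# fill = 999 is a sentinel for a patient's last visit: it gives a Daydiff far
# outside both windows, so both flags come out 0 for that row.
ds <- ds[ , Daydiff := as.numeric(difftime(shift(Admission, n = 1L, fill = 999, type = "lead"), Discharge, units = "days")),
by = "Patient"][ ,':='(Readmit30 = ifelse(abs(Daydiff) <= 30, 1, 0),
Readmit180= ifelse(abs(Daydiff) <= 180, 1, 0),
Daydiff = NULL)]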
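This results in the following (Daydiff is shown here for illustration; the final Daydiff = NULL in the code above removes it):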
> ds
   Visit Patient  Admission  Discharge      Daydiff Readmit30 Readmit180
1:     1       1 2015-01-01 2015-01-02      7.00000         1          1
2:     4       1 2015-01-09 2015-01-09    111.95833         0          1
3:     6       1 2015-05-01 2015-05-01 -16556.15510         0          0
4:     2       2 2015-01-01 2015-01-01     89.95833         0          1
5:     5       2 2015-04-01 2015-04-05 -16530.15510         0          0
6:     3       3 2015-01-01 2015-01-02 -16437.19677         0          0
Please keep in mind, though, that this only flags the visits that are followed by a readmission within 30 or 180 days; it does not flag the visits that are themselves the 30-day or 180-day readmission.
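If you need to flag the readmission visits themselves, one possible approach is to look backwards instead of forwards: compare each admission with the same patient's previous discharge using type = "lag". The following is only a sketch under some assumptions: the column names DaysSincePrev, IsReadmit30 and IsReadmit180 are made up for illustration, and the dates are parsed with as.Date rather than as.POSIXct so the differences are whole days.

```r
library(data.table)

# Same sample data as above
s <- read.table(text = "Visit Patient Admission Discharge
1 1 2015/01/01 2015/01/02
2 2 2015/01/01 2015/01/01
3 3 2015/01/01 2015/01/02
4 1 2015/01/09 2015/01/09
5 2 2015/04/01 2015/04/05
6 1 2015/05/01 2015/05/01", header = TRUE, stringsAsFactors = FALSE)

ds <- data.table(s)
# Assumption: plain dates (no time of day) are enough for this calculation
ds[, Admission := as.Date(Admission, format = "%Y/%m/%d")]
ds[, Discharge := as.Date(Discharge, format = "%Y/%m/%d")]
setkey(ds, Patient, Admission)

# Days between this admission and the same patient's previous discharge;
# NA for a patient's first visit (hypothetical column name)
ds[, DaysSincePrev := as.numeric(Admission - shift(Discharge, n = 1L, type = "lag")),
   by = Patient]

# A visit counts as a readmission when it starts within the window
ds[, IsReadmit30  := as.integer(!is.na(DaysSincePrev) & DaysSincePrev <= 30)]
ds[, IsReadmit180 := as.integer(!is.na(DaysSincePrev) & DaysSincePrev <= 180)]
```

On the sample data this marks visit 4 as a 30-day readmission and visits 4, 5 and 6 as 180-day readmissions; a patient's first visit is never flagged because it has no previous discharge.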