I have two data frames, each with multiple rows per ID. For each row of the first data frame, I need to find the row in the second data frame that has the same ID and the closest date, then add that date and its related data to the first data frame. This also has to work when NAs are present in the second data frame. Example data:
set.seed(42)
df1 <- data.frame(
  ID = sample(1:3, 10, replace = TRUE),
  dateTarget = strptime(paste(sprintf("%02d", sample(1:30, 10, replace = TRUE)),
                              sprintf("%02d", sample(1:12, 10, replace = TRUE)),
                              sprintf("%02d", sample(2013:2015, 10, replace = TRUE)),
                              sep = ""), "%d%m%Y"),
  Value = sample(15:100, 10, replace = TRUE))
# Note: ID has 10 values here and is recycled to fill the 20 rows of df2
df2 <- data.frame(
  ID = sample(1:3, 10, replace = TRUE),
  dateTarget = strptime(paste(sprintf("%02d", sample(1:30, 20, replace = TRUE)),
                              sprintf("%02d", sample(1:12, 20, replace = TRUE)),
                              sprintf("%02d", sample(2013:2015, 20, replace = TRUE)),
                              sep = ""), "%d%m%Y"),
  ValueMatch = sample(15:100, 20, replace = TRUE))
Is there something from base R that would do this, preferably using split and a mixture of the apply family?
The final table would look something like:
  ID dateTarget Value dateMatch ValueMatch
1  3   22-02-15    52  09-03-15         94
2  1   29-12-14    18  06-12-14         88
3  3   08-12-15    98  06-07-15         48
4  2   14-01-13    52  08-04-13         77
5  2   29-07-15    97  01-08-15         68
6  3   30-05-13    91  01-04-13         85
7  1   04-11-13    70  21-02-14         35
8  2   15-06-15    98  01-08-15         68
9  3   17-11-14    68  15-12-14         95
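For illustration, the kind of base R approach I have in mind might look like the sketch below. It is only a sketch under assumptions: nearest_match is a hypothetical helper name, the dates are stored as Date rather than the strptime result above, and the small data frames are hand-written for the example rather than taken from the random data. Rows of the second data frame with an NA date are dropped before matching.

```r
# Hypothetical helper: for each row of `a`, pick the row of `b` with the
# same ID whose dateTarget is closest in time (base R only).
nearest_match <- function(a, b) {
  b <- b[!is.na(b$dateTarget), ]        # ignore rows whose date is NA
  a$.row <- seq_len(nrow(a))            # remember the original row order
  pieces <- lapply(split(a, a$ID), function(rows) {
    cand <- b[b$ID == rows$ID[1], ]     # candidates sharing this ID
    idx <- vapply(seq_len(nrow(rows)), function(i) {
      gap <- abs(as.numeric(difftime(cand$dateTarget, rows$dateTarget[i],
                                     units = "days")))
      # No candidates, or an NA target date: there is nothing to match
      if (length(gap) == 0 || all(is.na(gap))) NA_integer_ else which.min(gap)
    }, integer(1))
    rows$dateMatch  <- cand$dateTarget[idx]
    rows$ValueMatch <- cand$ValueMatch[idx]
    rows
  })
  out <- do.call(rbind, pieces)
  out <- out[order(out$.row), ]         # restore the original order
  out$.row <- NULL
  rownames(out) <- NULL
  out
}

# Small hand-written example data (not the random data above)
a <- data.frame(ID = c(1, 2, 1, 3),
                dateTarget = as.Date(c("2014-12-29", "2013-01-14",
                                       "2013-11-04", "2015-02-22")),
                Value = c(18, 52, 70, 52))
b <- data.frame(ID = c(1, 1, 2, 2, 3),
                dateTarget = as.Date(c("2014-12-06", "2014-02-21",
                                       "2013-04-08", NA, "2015-03-09")),
                ValueMatch = c(88, 35, 77, 40, 94))

res <- nearest_match(a, b)
print(res)
```

Ties are resolved by which.min, which returns the first minimum, so the earlier candidate row wins when two dates are equally close.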
P.S. Are there better ways of generating random dates (not using seq.Date)?
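One alternative I can think of, as a minimal sketch, is to sample whole-day offsets and add them to a start date. The bounds and the variable names (start, end, n_days, random_dates) are assumptions chosen for the example. Unlike pasting day, month, and year strings together, this never produces an unparseable combination such as 30 February, so it yields no NAs on its own.

```r
set.seed(42)

# Assumed bounds for the example; both ends are inclusive
start <- as.Date("2013-01-01")
end   <- as.Date("2015-12-31")

# Number of days between the bounds, then sample offsets from the start
n_days <- as.integer(end - start)
random_dates <- start + sample(0:n_days, 10, replace = TRUE)

print(random_dates)
```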