I was surprised by the timing of the following:
R) system.time(lastOrder <- order[,lapply(.SD,tail,1),by="TRADER_ID,EXEC_IDATE"]);
user system elapsed
1.45 0.00 1.53
R) nrow(order)
[1] 75301
R) ncol(order)
[1] 23
That seemed very slow, so I then tried:
R) system.time(lastOrder <- order[,list(test=tail(EXEC_IDATE,1)),by="TRADER_ID,EXEC_IDATE"]);
user system elapsed
0.14 0.00 0.14
As far as I understand, once you know which rows to select, most of the work is done, so I don't see why applying this to all columns should take 10x longer. Am I doing something wrong in the first snippet? It is the only way I know to select the last row of each group.
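For comparison, here is a minimal sketch of one other way to pick the last row per group, assuming the data.table package. `lapply(.SD, tail, 1)` calls `tail()` once per column in every group, so the number of calls grows with the column count, which may explain part of the gap. The sketch instead collects the row number of each group's last row with `.I[.N]` and subsets the table once. The `orders` table and its `PRICE` and `QTY` columns are made up for illustration and are not the real data from the question.

```r
library(data.table)

# Hypothetical stand-in for the real `order` table (invented columns and values).
set.seed(1)
n <- 75301L
orders <- data.table(
  TRADER_ID  = sample(500L, n, replace = TRUE),
  EXEC_IDATE = as.IDate("2013-01-01") + sample(0:59, n, replace = TRUE),
  PRICE      = runif(n),
  QTY        = sample(1000L, n, replace = TRUE)
)

# Approach from the question: tail() runs once per column in every group.
slow <- orders[, lapply(.SD, tail, 1), by = "TRADER_ID,EXEC_IDATE"]

# Alternative: .I holds row numbers and .N the group size, so .I[.N] is the
# row number of the last row in each group. Subset the table once with those.
idx  <- orders[, .I[.N], by = "TRADER_ID,EXEC_IDATE"]$V1
fast <- orders[idx]

# Timings depend on the machine and the data, so none are claimed here.
system.time(orders[, lapply(.SD, tail, 1), by = "TRADER_ID,EXEC_IDATE"])
system.time(orders[orders[, .I[.N], by = "TRADER_ID,EXEC_IDATE"]$V1])
```

Both results contain one row per (`TRADER_ID`, `EXEC_IDATE`) pair, in order of each group's first appearance. The second form does the per-group work on row numbers only, so its cost should depend much less on how many columns the table has.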