Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - combining tail with by in data.table

What's the best way to get the tail row of a data.table by a factor?

Say I have:

> dt <- data.table(category = c("A", "A", "B", "B", "B"), value = c(1,2,3,4,5))
> dt
   category value
1:        A     1
2:        A     2
3:        B     3
4:        B     4
5:        B     5

I want to get the following result, but I'm not sure of the most efficient way to do it:

   category value
1:        A     2
2:        B     5

1 Reply


We can use last() from data.table, grouped by category:

 dt[,list(value=last(value)) , by = category]
 #     category value
 #1:        A     2
 #2:        B     5

If there are many columns, apply last to each column of .SD:

dt[, lapply(.SD, last), category]
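
As a minimal sketch of the multi-column case: the table below adds a hypothetical `label` column (not part of the original question) so that there is more than one value column to summarise. `.SD` holds the non-grouping columns of each group, and `lapply` applies `last` to each of them.

```r
library(data.table)

# Hypothetical table: 'label' is an assumed extra column for illustration only
dt_wide <- data.table(
  category = c("A", "A", "B", "B", "B"),
  value    = c(1, 2, 3, 4, 5),
  label    = c("a1", "a2", "b1", "b2", "b3")
)

# Take the last element of every non-grouping column, per category
res <- dt_wide[, lapply(.SD, last), by = category]
print(res)
```

The result should have one row per category, holding the final `value` and `label` seen in that group.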

Another option, if the data is ordered by 'category', is to keep only the last occurrence of each category:

dt[!duplicated(category, fromLast=TRUE)]
#    category value
#1:        A     2
#2:        B     5
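
The ordering caveat can be illustrated with a small sketch. The table `dt_mixed` is a made-up example in which the categories are interleaved. Both approaches still pick the last row of each category, but the `duplicated` version returns rows in their original positions, while the grouped version returns groups in order of first appearance.

```r
library(data.table)

# Hypothetical table with interleaved categories (not from the original question)
dt_mixed <- data.table(category = c("A", "B", "A"), value = c(1, 2, 3))

# Keeps rows 2 and 3, in their original row order
by_pos <- dt_mixed[!duplicated(category, fromLast = TRUE)]

# Groups appear in order of first appearance: A, then B
by_grp <- dt_mixed[, list(value = last(value)), by = category]

print(by_pos)
print(by_grp)
```

If the row order of the result matters, sorting beforehand or using the grouped form is probably the safer choice.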

Or, as @Frank mentioned, use unique with fromLast:

unique(dt, by="category", fromLast=TRUE)

Or we can use last directly on .SD (as @jangorecki mentioned in the comments)

dt[, last(.SD), category]

Note that dplyr also exports a function named last. If both packages are loaded, it is best to call data.table::last explicitly, so that you get the data.table version regardless of which package was attached last.
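
A short sketch of the namespace-qualified call, using the table from the question. Only data.table is attached here, so the qualification is not strictly needed in this snippet; it simply shows the form that stays unambiguous when dplyr is attached as well.

```r
library(data.table)

dt <- data.table(category = c("A", "A", "B", "B", "B"), value = c(1, 2, 3, 4, 5))

# Qualify last() with its namespace so another package cannot mask it
res <- dt[, list(value = data.table::last(value)), by = category]
print(res)
```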

