r - Index unique values in data.table

Question

posted Oct 17, 2021 in Technique by 深蓝 (71.8m points)

I'm not sure how to phrase this, but how can I create an index column in a data.table that, within each group, increments whenever a different value appears?

Here is a minimal working example:

library(data.table)
in.data <- data.table(fruits=c(rep("banana", 4), rep("pear", 5)),vendor=c("a", "b", "b", "c", "d", "d", "e", "f", "f"))

Here is the result the R-code should generate

in.data[, wanted.column:=c(1,2,2,3,1,1,2,3,3)]

#    fruits vendor wanted.column
# 1: banana      a             1
# 2: banana      b             2
# 3: banana      b             2
# 4: banana      c             3
# 5:   pear      d             1
# 6:   pear      d             1
# 7:   pear      e             2
# 8:   pear      f             3
# 9:   pear      f             3

So it labels each vendor 1, 2, 3, ... within each fruit. There is probably a very simple solution, but I'm stuck.

1 Reply

深蓝 · Answer 1 · 2021-10-17T01:11:06+0000

I have a few ideas. You can use a nested group counter:

in.data[, w := setDT(list(v = vendor))[, g := .GRP, by=v]$g, by=fruits]
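To see what the inner step does, here is a minimal sketch that pulls one fruit's vendors out by hand. The table name banana.vendors is made up for illustration; the only data.table features assumed are .GRP and by, which number groups in order of first appearance.

```r
library(data.table)

# One fruit's vendors, extracted by hand for illustration (hypothetical name)
banana.vendors <- data.table(v = c("a", "b", "b", "c"))

# .GRP is the running group counter; with by = v, groups are numbered
# in the order in which each vendor first appears
banana.vendors[, g := .GRP, by = v]

banana.vendors$g  # 1 2 2 3
```

The one-liner above does the same thing once per fruit, building the small inner table on the fly with setDT(list(v = vendor)).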

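Alternatively, make a run ID with rleid. This only gives the right answer when each vendor's rows are contiguous within a fruit, that is, when the data is sorted (thanks, @eddi), and it seems wasteful: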

in.data[, w := rleid(vendor), by=fruits]
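The dependence on sorted data can be illustrated with a small sketch. The table unsorted below is hypothetical and not part of the question's data: vendor "a" shows up twice with another vendor in between, so rleid starts a new run for it while match reuses the first index.

```r
library(data.table)

# Hypothetical input where vendor "a" appears in two non-adjacent rows
unsorted <- data.table(fruits = rep("banana", 4),
                       vendor = c("a", "b", "a", "c"))

# rleid increments on every change of value, even back to a value seen before
unsorted[, by.run := rleid(vendor), by = fruits]

# match against unique() indexes each distinct value by first appearance
unsorted[, by.match := match(vendor, unique(vendor)), by = fruits]

unsorted$by.run    # 1 2 3 4
unsorted$by.match  # 1 2 1 3
```

On the question's data, where the vendors are already contiguous within each fruit, both columns would agree.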

The more base-R-like approach, still inside data.table, would probably be:

in.data[, w := match(vendor, unique(vendor)), by=fruits]

# or in pure base R (ave returns character here because vendor is character, so convert it) ...

in.data$w = with(in.data, as.integer(ave(vendor, fruits, FUN = function(x) match(x, unique(x)))))
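One thing to watch with ave: it writes the results back into a copy of its first argument, so with a character vendor the index comes out as character. The sketch below uses plain vectors (no data.table needed) to show this, along with two ways to get an integer column. The variable names raw, w and w2 are only for illustration.

```r
fruits <- c(rep("banana", 4), rep("pear", 5))
vendor <- c("a", "b", "b", "c", "d", "d", "e", "f", "f")

# ave() fills its result into a copy of vendor, so the integers are coerced
raw <- ave(vendor, fruits, FUN = function(x) match(x, unique(x)))
class(raw)  # "character"

# Option 1: convert afterwards
w <- as.integer(raw)

# Option 2: pass integer positions to ave() and index vendor inside FUN
w2 <- ave(seq_along(vendor), fruits,
          FUN = function(i) match(vendor[i], unique(vendor[i])))

w   # 1 2 2 3 1 1 2 3 3
w2  # 1 2 2 3 1 1 2 3 3
```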
