I have the following data.table
set.seed(1)
DT <- data.table(VAL = sample(c(1, 2, 3), 10, replace = TRUE))
VAL
1: 1
2: 2
3: 2
4: 3
5: 1
6: 3
7: 3
8: 2
9: 2
10: 1
Within each number in VAL
I want to:
- Count the number of records/rows
- Create an row index (counter) of first, second, third occurrence et c.
At the end I want the result
VAL COUNT IDX
1: 1 3 1
2: 2 4 1
3: 2 4 2
4: 3 3 1
5: 1 3 2
6: 3 3 2
7: 3 3 3
8: 2 4 3
9: 2 4 4
10: 1 3 3
where "COUNT" is the number of records/rows for each "VAL", and "IDX" is the row index within each "VAL".
I tried to work with which
and length
using .I
:
dt[, list(COUNT = length(VAL == VAL[.I]),
IDX = which(which(VAL == VAL[.I]) == .I))]
but this does not work as .I
refers to a vector with the index, so I guess one must use .I[]
. Though inside .I[]
I again face the problem, that I do not have the row index and I do know (from reading data.table
FAQ and following the posts here) that looping through rows should be avoided if possible.
So, what's the data.table
way?
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…