Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - generate id for each group with repeated and missing observations

I have a dataset with individuals observed over several weeks. Some individuals have no observations in some weeks, and some have several observations during the same week. I need to create a weekly ID (id_week in the code) that is individual-specific. If an individual has two or more observations in one week, id_week should be the same for all of them. If an individual has no observations in a given week, the ID for the next observed week should follow consecutively from the last observed one. This would result in the following data:

dt <- data.frame(individ = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
                 week    = c(1, 2, 2, 1, 2, 4, 1, 3, 4, 4),
                 id_week = c(1, 2, 2, 1, 2, 3, 1, 2, 3, 3))

I have tried dt[, id := .GRP, by = .(individ, week)], but it numbers every individual-week combination across the whole table rather than restarting at 1 for each individual. I also tried a dplyr solution, but it does not account for repeated observations within one week: it assigns a separate ID to every row, which is not what I need.

library(dplyr)
dt %>%
  group_by(individ) %>%
  mutate(pp = row_number(week))  # tied weeks get different IDs
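
For reference, dplyr documents row_number(x) as equivalent to base R's rank(x, ties.method = "first"), which would explain why repeated weeks receive different IDs. A minimal base R sketch on the example data, assuming a dense ranking within each individual is what is wanted (the column names pp_first and pp_dense are made up for illustration):

```r
# Example data from the question, without the target column
dt <- data.frame(individ = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3),
                 week    = c(1, 2, 2, 1, 2, 4, 1, 3, 4, 4))

# Base R stand-in for row_number(week) per individual:
# ties are broken by position, so repeated weeks get different IDs
dt$pp_first <- ave(dt$week, dt$individ,
                   FUN = function(x) rank(x, ties.method = "first"))

# Dense ranking per individual: repeated weeks share an ID, no gaps
dt$pp_dense <- ave(dt$week, dt$individ,
                   FUN = function(x) match(x, sort(unique(x))))

print(dt)
```

Here pp_first counts 1, 2, 3 for the first individual even though weeks 2 and 2 are tied, while pp_dense reproduces the id_week column shown above.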

1 Reply


Here are a few alternatives:

1) Using dense_rank:

library(dplyr)
dt %>% group_by(individ) %>% mutate(id_week = dense_rank(week))

2) Using match and unique:

dt$id_week <- with(dt, ave(week, individ, FUN = function(x) match(x, unique(x))))

3) Converting to factor and then integer:

library(data.table)
setDT(dt)[, id_week := as.integer(factor(week)), by = individ]
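
The three alternatives agree on the example data, but they are not interchangeable when weeks are not sorted within an individual: match(x, unique(x)) numbers weeks by order of first appearance, while dense_rank and the factor conversion number them by value. A minimal base R sketch of that difference, using a made-up, deliberately unsorted vector of weeks for a single individual (by_dense is a base R stand-in for dplyr::dense_rank, not the dplyr call itself):

```r
# Hypothetical weeks for one individual, deliberately out of order
week <- c(4, 1, 4, 2)

# Alternative 2: numbered by order of first appearance
by_appearance <- match(week, unique(week))

# Alternative 3: factor levels are sorted, so numbered by value
by_value <- as.integer(factor(week))

# Base R stand-in for a dense rank: position among sorted unique values
by_dense <- match(week, sort(unique(week)))

print(by_appearance)  # 1 2 1 3
print(by_value)       # 3 1 3 2
print(by_dense)       # 3 1 3 2
```

If the rows may be unsorted, either sort by individ and week first (for example with order()) or prefer one of the value-based variants.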
