Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
487 views
in Technique[技术] by (71.8m points)

r - Use a factor column in "by" and do not drop empty factors

Suppose I have a data.table:

x <- data.table(x=runif(3), group=factor(c('a','b','a'), levels=c('a','b','c')))

I want to know how many rows in x exist for each group:

x[, .N, by="group"]
#    group N
# 1:     a 2
# 2:     b 1

Question: is there some way to force the above by="group" to consider all levels of the factor group?

Notice how since I don't have any rows of with group 'c' in the table, I don't get a row for c.

Desired output:

x[, .N, by="group", ???] # somehow use all levels in `group`
#    group N
# 1:     a 2
# 2:     b 1
# 3:     c 0
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

If you are willing to run through the factor levels by enumerating them in i (rather than by setting by="group"), this will get you the hoped for results.

setkey(x, "group")
x[levels(group), .N, by=.EACHI]
#    group N
# 1:     a 2
# 2:     b 1
# 3:     c 0

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...