Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
373 views
in Technique[技术] by (71.8m points)

r - How to select rows according to column value conditions

I have a data set which looks like the following (partially):

id  name    dummy
1   Jane    1
1   Jane    0
1   Jane    1
2   Mike    0
2   Mike    0
2   Mike    0
2   Mike    0
2   Mike    0
3   Tom     1
3   Tom     1
3   Tom     0
3   Tom     0

I'm trying to eliminate the people where ALL of the variable dummy is 0. So for instance, Tom and Jane would not be eliminated because they have dummy variable 0 or 1, but Mike will be eliminated because he has all 0s. So I would want in the end

   id   name    dummy
    1   Jane    1
    1   Jane    0
    1   Jane    1
    3   Tom     1
    3   Tom     1
    3   Tom     0
    3   Tom     0

I thought about sorting the data frame according to dummy but I can't seem to figure out how to deal with the fact that I'm only trying to eliminate the people who only has 0 values for the variable dummy. Any suggestions would be really helpful!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Consider df is your data.frame, then use tapply and [ to subset what you want:

> ind <- with(df, tapply(dummy, name, sum))
> df[df$name %in% names(ind)[ind!=0], ]
   id name dummy
1   1 Jane     1
2   1 Jane     0
3   1 Jane     1
9   3  Tom     1
10  3  Tom     1
11  3  Tom     0
12  3  Tom     0

Another alternative:

> result <- split(df, df$name)[with(df, tapply(dummy, name, function(x) sum(x)!=0))]
> do.call(rbind, result)

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...