Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
359 views
in Technique[技术] by (71.8m points)

r - Filter group of rows based on sum of values from different column

I'm trying to filter out whole rows in R, but only if the frequencies for a particular set don't add up to more than 5.

The data I have looks a bit like this. It's a dataframe that I'm currently calling "Words":

HEADWORD VARIANT FREQUENCY
 SWORD    sword      2
 SWORD    swerd      1
 SWORD    sworde     1
 KNIGHT   knight     6
 KNIGHT   kniht      2
 KNIGHT   knyt       1

I only want rows for which the frequencies within a particular headword add up to more than 5. So here, I want to keep all the instances of KNIGHT but I want to get rid of all the SWORD rows entirely.

I tried to do this on dplyr, but with no success. This is the code I tried:

Words1 %>% group_by(HW) %>%  filter(Fr > 5)
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

We need to get the sum of 'FREQUENCY' and check whether it is greater than 5 in the filter after grouping by 'HEADWORD'

Words1 %>% 
     group_by(HEADWORD) %>% 
     filter(sum(FREQUENCY) >5)   
#   HEADWORD VARIANT FREQUENCY
#     <chr>   <chr>     <int>
#1   KNIGHT  knight         6
#2   KNIGHT   kniht         2 
#3   KNIGHT    knyt         1

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...