Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Only filter values in a column based on a condition

Let's say I have the following dataframe:

my_basket = data.frame(ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit","Vegetable","Vegetable","Vegetable","Vegetable","Dairy","Dairy","Dairy","Dairy","Dairy"), 
                   ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya","Carrot","Potato","Brinjal","Raddish","Milk","Curd","Cheese","Milk","Paneer"),
                   Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,NA),
                   Tax = c(2,4,5,6,2,3,5,1,3,4,5,6,4,NA))

This then yields:

    > my_basket
   ITEM_GROUP ITEM_NAME Price Tax
1       Fruit     Apple   100   2
2       Fruit    Banana    80   4
3       Fruit    Orange    80   5
4       Fruit     Mango    90   6
5       Fruit    Papaya    65   2
6   Vegetable    Carrot    70   3
7   Vegetable    Potato    60   5
8   Vegetable   Brinjal    70   1
9   Vegetable   Raddish    25   3
10      Dairy      Milk    60   4
11      Dairy      Curd    40   5
12      Dairy    Cheese    35   6
13      Dairy      Milk    50   4
14      Dairy    Paneer    NA  NA

What I would now like to do is make a list of the fruits I want to keep and then filter on it:

fruitlist = c("Apple", "Banana")

How would I use the tidyverse to filter my data.frame so that it keeps only the fruits in my fruitlist, but also keeps all of my vegetables and dairy? Normally (with dplyr loaded, plus magrittr for the %<>% operator) I'd do:

my_basket %<>% filter(ITEM_NAME %in% fruitlist)

But then I'd also lose all the vegetables and dairy, which is not what I want. I've been trying to make something work with case_when but can't seem to make it work. There must be something obvious I'm missing here.

EDIT: Seconds after posting my question I finally realised:

my_basket %<>% filter(ITEM_NAME %in% fruitlist | ITEM_GROUP != "Fruit")

That solves it. If I had to filter several groups this way, I think piping one such filter call per group would work too, since each call only restricts its own group and lets every other group through.
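The chained version mentioned above could be sketched as follows. This is only an illustration under assumptions: `dairylist` and the result name `filtered` are hypothetical and not part of the original question, and plain assignment with `%>%` is used instead of `%<>%` so that only dplyr is needed.

```r
library(dplyr)

# Same data as in the question
my_basket <- data.frame(
  ITEM_GROUP = c(rep("Fruit", 5), rep("Vegetable", 4), rep("Dairy", 5)),
  ITEM_NAME  = c("Apple", "Banana", "Orange", "Mango", "Papaya",
                 "Carrot", "Potato", "Brinjal", "Raddish",
                 "Milk", "Curd", "Cheese", "Milk", "Paneer"),
  Price = c(100, 80, 80, 90, 65, 70, 60, 70, 25, 60, 40, 35, 50, NA),
  Tax   = c(2, 4, 5, 6, 2, 3, 5, 1, 3, 4, 5, 6, 4, NA)
)

fruitlist <- c("Apple", "Banana")
dairylist <- c("Milk", "Curd")  # hypothetical second keep-list (assumption)

# Each filter() restricts one group and passes all other groups through,
# so chaining them applies every restriction in turn.
filtered <- my_basket %>%
  filter(ITEM_NAME %in% fruitlist | ITEM_GROUP != "Fruit") %>%
  filter(ITEM_NAME %in% dairylist | ITEM_GROUP != "Dairy")

print(filtered)
```

With these lists, the result keeps the two listed fruits, all four vegetables, and the Milk and Curd rows.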

question from:https://stackoverflow.com/questions/66061964/only-filter-values-in-a-column-based-on-a-condition


1 Reply


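You could use grepl with a regex alternation, combined with the same ITEM_GROUP condition so that the other groups are kept. On its own, the grepl test is equivalent to ITEM_NAME %in% fruitlist and would still drop the vegetables and dairy:

fruitlist <- c("Apple", "Banana")
# Anchored alternation: matches exactly "Apple" or "Banana"
regex <- paste0("^(?:", paste0(fruitlist, collapse="|"), ")$")
# Keep the listed fruits, plus every row that is not a fruit
my_basket %<>% filter(grepl(regex, ITEM_NAME) | ITEM_GROUP != "Fruit")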
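Since the question mentions trying case_when, here is one way that route might look. It is a sketch rather than the asker's own attempt: the result name `kept` is hypothetical, and the idea is simply to have case_when return a logical value per row that filter can use.

```r
library(dplyr)

# Same data as in the question
my_basket <- data.frame(
  ITEM_GROUP = c(rep("Fruit", 5), rep("Vegetable", 4), rep("Dairy", 5)),
  ITEM_NAME  = c("Apple", "Banana", "Orange", "Mango", "Papaya",
                 "Carrot", "Potato", "Brinjal", "Raddish",
                 "Milk", "Curd", "Cheese", "Milk", "Paneer"),
  Price = c(100, 80, 80, 90, 65, 70, 60, 70, 25, 60, 40, 35, 50, NA),
  Tax   = c(2, 4, 5, 6, 2, 3, 5, 1, 3, 4, 5, 6, 4, NA)
)

fruitlist <- c("Apple", "Banana")

kept <- my_basket %>%
  filter(case_when(
    ITEM_GROUP == "Fruit" ~ ITEM_NAME %in% fruitlist,  # fruit: only listed names
    TRUE                  ~ TRUE                        # other groups: keep every row
  ))

print(kept)
```

For a single group this is more verbose than the `|` condition, but each additional group becomes one more case_when branch instead of another filter call.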
