Let's say I have the following dataframe:
my_basket = data.frame(ITEM_GROUP = c("Fruit","Fruit","Fruit","Fruit","Fruit","Vegetable","Vegetable","Vegetable","Vegetable","Dairy","Dairy","Dairy","Dairy","Dairy"),
ITEM_NAME = c("Apple","Banana","Orange","Mango","Papaya","Carrot","Potato","Brinjal","Raddish","Milk","Curd","Cheese","Milk","Paneer"),
Price = c(100,80,80,90,65,70,60,70,25,60,40,35,50,NA),
Tax = c(2,4,5,6,2,3,5,1,3,4,5,6,4,NA))
This then yields:
> my_basket
   ITEM_GROUP ITEM_NAME Price Tax
1       Fruit     Apple   100   2
2       Fruit    Banana    80   4
3       Fruit    Orange    80   5
4       Fruit     Mango    90   6
5       Fruit    Papaya    65   2
6   Vegetable    Carrot    70   3
7   Vegetable    Potato    60   5
8   Vegetable   Brinjal    70   1
9   Vegetable   Raddish    25   3
10      Dairy      Milk    60   4
11      Dairy      Curd    40   5
12      Dairy    Cheese    35   6
13      Dairy      Milk    50   4
14      Dairy    Paneer    NA  NA
What I would now like to do is make a list of the fruits I want to keep and then filter on it:
fruitlist = c("Apple", "Banana")
How would I go about using tidyverse to filter the data in my data.frame to only keep the fruits in my fruitlist, but also all my Vegetables and Dairy? Normally I'd do:
my_basket %<>% filter(ITEM_NAME %in% fruitlist)
But then I'd also lose all the vegetables and dairy, which is not what I want. I've been trying to make something work with case_when but can't seem to make it work. There must be something obvious I'm missing here.
EDIT: Seconds after posting my question I finally realised:
my_basket %<>% filter(ITEM_NAME %in% fruitlist | ITEM_GROUP != "Fruit")
That solves it: a row is kept if it is a fruit on my list, or if it is not a fruit at all. If I had to filter multiple groups like this, I think chaining several filter calls, one per group, would work too.
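For the multiple-group case, here is a minimal sketch of what chaining could look like. It assumes a second keep-list, dairylist, which is made up for illustration and not part of my actual data requirements. Each filter call only restricts its own group and lets every other group through, so the calls can be stacked in any order.

```r
library(dplyr)

# Same data as above.
my_basket <- data.frame(
  ITEM_GROUP = c(rep("Fruit", 5), rep("Vegetable", 4), rep("Dairy", 5)),
  ITEM_NAME = c("Apple", "Banana", "Orange", "Mango", "Papaya",
                "Carrot", "Potato", "Brinjal", "Raddish",
                "Milk", "Curd", "Cheese", "Milk", "Paneer"),
  Price = c(100, 80, 80, 90, 65, 70, 60, 70, 25, 60, 40, 35, 50, NA),
  Tax = c(2, 4, 5, 6, 2, 3, 5, 1, 3, 4, 5, 6, 4, NA)
)

fruitlist <- c("Apple", "Banana")
dairylist <- c("Milk", "Curd")  # hypothetical second keep-list, for illustration only

filtered <- my_basket %>%
  # Keep listed fruits, plus every row that is not a fruit.
  filter(ITEM_NAME %in% fruitlist | ITEM_GROUP != "Fruit") %>%
  # Keep listed dairy items, plus every row that is not dairy.
  filter(ITEM_NAME %in% dairylist | ITEM_GROUP != "Dairy")

print(filtered)
```

Vegetables pass through both steps untouched because neither condition restricts them. Note that the %<>% operator used above comes from the magrittr package, which library(tidyverse) does not attach, so it needs library(magrittr); the sketch sidesteps that by assigning the result with <- instead.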
question from:
https://stackoverflow.com/questions/66061964/only-filter-values-in-a-column-based-on-a-condition