r - Summarize data based on unique ID column

Question


posted Oct 7, 2021 in Technique by 深蓝 (71.8m points)

I am trying to summarise multiple columns based on an ID column so I don't double count observations. I have managed to use tapply to get what I need for one variable at a time but can't do this for several variables at the same time.

In addition, the data frame I want to apply this to has more than 50,000 rows, and I want to apply it to more than 10 different count variables. I was wondering whether there is a better solution within dplyr, as I ultimately want to build a Shiny dashboard with this data.

I have replicated a small sample of the data and shown the existing code.

library(dplyr)

# Create data frame
df <- data.frame(ID = c(1, 1, 2, 3, 4, 4, 4),
                 Count = c(1, 1, 30, 15, 1, 1, 1),
                 Count2 = c(1, 1, 20, 10, 1, 1, 1),
                 Service = c("Service A", "Service B", "Service C", "Service D",
                             "Service E", "Service F", "Service G"))

#Create object of variables to count
myvars <- c("Count", "Count2")

# Attempt to count both variables at once (does not work: myvars holds
# the column names as strings, not the column values)
df %>% 
  group_by(ID) %>%
  summarise(value_sum = sum(tapply(myvars, ID, FUN = max))) %>% 
  summarise(value_sum = sum(value_sum))


#Count number of unique frequencies (code works for one variable at a time)
df %>% 
  group_by(ID) %>%
  summarise(value_sum = sum(tapply(Count, ID, FUN = max))) %>% 
  summarise(value_sum = sum(value_sum))

df %>% 
  group_by(ID) %>%
  summarise(value_sum = sum(tapply(Count2, ID, FUN = max))) %>% 
  summarise(value_sum = sum(value_sum))

question from:https://stackoverflow.com/questions/65884142/summarize-data-based-on-unique-id-column

1 Reply

深蓝 · Answer 1 · 2021-10-06T19:20:36+0000

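You can use across() inside summarise() to work on several variables at once. The first summarise() takes the maximum of each variable per ID, so repeated rows for the same ID are counted only once; the second adds those per-ID values together. In your case: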

df %>%
  group_by(ID) %>%
  # all_of() selects the columns named in the character vector myvars
  summarise(across(all_of(myvars), max)) %>%
  summarise(across(all_of(myvars), sum))
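
For reference, here is a self-contained sketch of the same approach run on the sample data from the question. It assumes dplyr 1.0.0 or later (where across() is available). The `result` name, the `.groups = "drop"` argument, and wrapping `myvars` in all_of() are additions for illustration rather than part of the original answer; the Service column is left out because it is not summarised.

```r
library(dplyr)

# Sample data from the question (Service column omitted)
df <- data.frame(
  ID     = c(1, 1, 2, 3, 4, 4, 4),
  Count  = c(1, 1, 30, 15, 1, 1, 1),
  Count2 = c(1, 1, 20, 10, 1, 1, 1)
)

# Names of the count columns to summarise
myvars <- c("Count", "Count2")

result <- df %>%
  group_by(ID) %>%
  # Step 1: one row per ID, keeping the maximum of each count column
  summarise(across(all_of(myvars), max), .groups = "drop") %>%
  # Step 2: add the per-ID values together
  summarise(across(all_of(myvars), sum))

print(result)
# Count = 47 (1 + 30 + 15 + 1), Count2 = 32 (1 + 20 + 10 + 1)
```

Because the column names live in `myvars`, extending this to ten or more count variables should only require adding their names to that vector.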
