r - Using summarise, across, and quantile functions together

I am trying to use the mtcars dataset to calculate summary statistics. Here is my code:

df <- as_tibble(mtcars)


df.sum2 <- df %>%
  select(mpg, cyl, vs, am, gear, carb) %>% 
  mutate(across(where(is.factor), as.numeric)) %>% 
  summarise(across(
    .cols = everything(), 
    .fns = list(
                Min = min, 
                Q25 = quantile (., 0.25), 
                Median = median, 
                Q75 = quantile (., 0.75), 
                Max = max,
                Mean = mean, 
                StdDev = sd,
                N = n()
                ), na.rm = T,
   .names = "{col}_{fn}"
                   )
            )

But I got the following error:

Error: Problem with summarise() input ..1.
x Can't subset columns that don't exist.
x Locations 65, 66, 69, 71, 76, etc. don't exist.
i There are only 6 columns.
i Input ..1 is across(...).

If I take out Q25 = quantile (., 0.25) and Q75 = quantile (., 0.75) from the above code, it works fine. I can also get the expected results using the following code:

df.sum <- df %>%
  select(mpg, cyl, vs, am, gear, carb) %>% # select variables to summarise
  summarise_each(funs(Min = min, 
                      Q25 = quantile (., 0.25), 
                      Median = median, 
                      Q75 = quantile (., 0.75), 
                      Max = max,
                      Mean = mean, 
                      StdDev = sd,
                      N = n()))

However, I want to use across() together with summarise(), not summarise_each().

Question from: https://stackoverflow.com/questions/65660971/using-summarise-across-and-quantile-functions-together


1 Reply


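When you pass additional arguments, you need to wrap the call in an anonymous function or use the formula (~) syntax. Written as quantile(., 0.25), the list entry is a call that R evaluates as soon as the list is built, not a function that across() can apply to each column; the same applies to n(). Try: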

library(dplyr)

df.sum2 <- df %>%
  select(mpg, cyl, vs, am, gear, carb) %>% 
  mutate(across(where(is.factor), as.numeric)) %>% 
  summarise(across(
    .cols = everything(), 
    .fns = list(
      Min = min, 
      Q25 = ~quantile(., 0.25), 
      Median = median, 
      Q75 = ~quantile(., 0.75), 
      Max = max,
      Mean = mean, 
      StdDev = sd,
      N = ~n()
    ),
    .names = "{col}_{fn}"
  )
  )
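
The same idea can also be written with plain anonymous functions instead of the formula shorthand. The following is only a minimal sketch under a few assumptions: the list name stats and the result name res are hypothetical, only two columns are selected to keep the output small, na.rm = TRUE is set inside each function rather than passed through across(), and .names uses the {.col} and {.fn} placeholders from the dplyr documentation.

```r
library(dplyr)

# Hypothetical helper list (not part of the original answer): every entry is
# a function of one column, so across() can call it once per column.
stats <- list(
  Min    = function(x) min(x, na.rm = TRUE),
  Q25    = function(x) unname(quantile(x, 0.25, na.rm = TRUE)),
  Median = function(x) median(x, na.rm = TRUE),
  Q75    = function(x) unname(quantile(x, 0.75, na.rm = TRUE)),
  Max    = function(x) max(x, na.rm = TRUE),
  N      = function(x) sum(!is.na(x))  # count of non-missing values
)

res <- as_tibble(mtcars) %>%
  select(mpg, cyl) %>%
  summarise(across(
    .cols  = everything(),
    .fns   = stats,
    .names = "{.col}_{.fn}"
  ))

print(res)
```

Setting na.rm inside each function keeps every extra argument next to the function it belongs to, which may be easier to maintain than relying on arguments forwarded through across().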
