Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
385 views
in Technique[技术] by (71.8m points)

r - How to pass strings denoting expressions to dplyr 0.7 verbs?

I would like to understand how to pass strings representing expressions into dplyr, so that the variables mentioned in the string are evaluated as expressions on columns in the dataframe. The main vignette on this topic covers passing in quosures, and doesn't discuss strings at all.

It's clear that quosures are safer and clearer than strings when representing expressions, so of course we should avoid strings when quosures can be used instead. However, when working with tools outside the R ecosystem, such as javascript or YAML config files, one will often have to work with strings instead of quosures.

For example, say I want a function that does a grouped tally using expressions passed in by the user/caller. As expected, the following code doesn't work, since dplyr uses nonstandard evaluation to interpret the arguments to group_by.

library(tidyverse)

group_by_and_tally <- function(data, groups) {
  data %>%
    group_by(groups) %>%
    tally()
}

my_groups <- c('2 * cyl', 'am')
mtcars %>%
  group_by_and_tally(my_groups)
#> Error in grouped_df_impl(data, unname(vars), drop): Column `groups` is unknown

In dplyr 0.5 we would use standard evaluation, such as group_by_(.dots = groups), to handle this situation. Now that the underscore verbs are deprecated, how should we do this kind of thing in dplyr 0.7?

In the special case of expressions that are just column names we can use the solutions to this question, but they don't work for more complex expressions like 2 * cyl that aren't just a column name.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

It's important to note that, in this simple example, we have control of how the expressions are created. So the best way to pass the expressions is to construct and pass quosures directly using quos():

library(tidyverse)
library(rlang)

group_by_and_tally <- function(data, groups) {
  data %>%
    group_by(UQS(groups)) %>%
    tally()
}

my_groups <- quos(2 * cyl, am)
mtcars %>%
  group_by_and_tally(my_groups)
#> # A tibble: 6 x 3
#> # Groups:   2 * cyl [?]
#>   `2 * cyl`    am     n
#>       <dbl> <dbl> <int>
#> 1         8     0     3
#> 2         8     1     8
#> 3        12     0     4
#> 4        12     1     3
#> 5        16     0    12
#> 6        16     1     2

However, if we receive the expressions from an outside source in the form of strings, we can simply parse the expressions first, which converts them to quosures:

my_groups <- c('2 * cyl', 'am')
my_groups <- my_groups %>% map(parse_quosure)
mtcars %>%
  group_by_and_tally(my_groups)
#> # A tibble: 6 x 3
#> # Groups:   2 * cyl [?]
#>   `2 * cyl`    am     n
#>       <dbl> <dbl> <int>
#> 1         8     0     3
#> 2         8     1     8
#> 3        12     0     4
#> 4        12     1     3
#> 5        16     0    12
#> 6        16     1     2

Again, we should only do this if we are getting expressions from an outside source that provides them as strings - otherwise we should make quosures directly in the R source code.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...