Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

r - Get expression that evaluated to dot in function called by `magrittr` pipe

I have a function x_expression() which prints the expression passed to argument x.

pacman::p_load(magrittr, rlang)

x_expression <- function(x) {
  print(enquo(x))
}

y <- 1

x_expression(y)
#> <quosure>
#>   expr: ^y
#>   env:  global

y %>% x_expression()
#> <quosure>
#>   expr: ^.
#>   env:  0x7ff27c36a610

So you can see that it knows y was passed to it, but when y is piped in with %>%, the function prints . instead. Is there a way to recover the y when it is piped in, or is it gone forever? In brief, what I want is a function like x_expression() that prints y in both cases above.

This question is indeed similar to "Get name of dataframe passed through pipe in R", but it is slightly more general: that person just wants the name of the data frame, whereas I want the expression, whatever it is. The same answer will likely apply to both. I don't like the answer to that near-duplicate question, and neither does its author.


1 Reply

y is not "gone forever", because the pipe calls your function, and it also knows about y. There's a way to recover y, but it requires some traversal of the calling stack. To understand what's happening, we'll use ?sys.frames and ?sys.calls:

‘sys.calls’ and ‘sys.frames’ give a pairlist of all the active calls and frames, respectively, and ‘sys.parents’ returns an integer vector of indices of the parent frames of each of those frames.
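
As a small base-R illustration of that description (the functions inner() and outer() are made up for this sketch and are not part of the question), the call stack can be inspected from inside a nested call like this:

```r
# Hypothetical helper functions, used only to show what the stack looks like.
inner <- function() {
  calls  <- sys.calls()    # pairlist of active calls, outermost first
  frames <- sys.frames()   # the matching evaluation environments
  list(
    n_calls   = length(calls),
    n_frames  = length(frames),
    last_call = calls[[length(calls)]],      # the call to inner() itself
    prev_call = calls[[length(calls) - 1]]   # the call that invoked inner()
  )
}

outer <- function(a) inner()

res <- outer(1 + 2)
print(res$last_call)   # inner()
print(res$prev_call)   # outer(1 + 2)
```

The caller's unevaluated argument (1 + 2) is still visible in the recorded call, which is the property the rest of this answer relies on.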

If we sprinkle these throughout your x_expression(), we can see what happens when we call y %>% x_expression() from the global environment:

x_expression <- function(x) {
  print( enquo(x) )
  # <quosure>
  #   expr: ^.
  #   env:  0x55c03f142828                <---

  str(sys.frames())
  # Dotted pair list of 9
  #  $ :<environment: 0x55c03f151fa0> 
  #  $ :<environment: 0x55c03f142010> 
  #  ...
  #  $ :<environment: 0x55c03f142828>     <---
  #  $ :<environment: 0x55c03f142940>

  str(sys.calls())
  # Dotted pair list of 9
  #  $ : language y %>% x_expression()    <---
  #  $ : language withVisible(eval(...
  #  ...
  #  $ : language function_list[[k]...
  #  $ : language x_expression(.)
}

I highlighted the important parts with <---. Notice that the quosure captured by enquo lives in the parent environment of the function (second from the bottom of the stack), while the pipe call that knows about y is all the way at the top of the stack.

There are a couple of ways to traverse the stack. @MrFlick's answer to a similar question, as well as this GitHub issue, traverses the frames / environments from sys.frames(). Here, I will show an alternative that traverses sys.calls() and parses the expressions to find %>%.

The first piece of the puzzle is to define a function that converts an expression to its Abstract Syntax Tree (AST):

# Recursively constructs Abstract Syntax Tree for a given expression
getAST <- function(ee) purrr::map_if(as.list(ee), is.call, getAST)
# Example: getAST( quote(a %>% b) )
# List of 3
#  $ : symbol %>%
#  $ : symbol a
#  $ : symbol b

We can now systematically apply this function to the entire sys.calls() stack. The goal is to identify ASTs where the first element is %>%; the second element will then correspond to the left-hand side of the pipe (symbol a in the a %>% b example). If there is more than one such AST, then we're in a nested %>% pipe scenario. In this case, the last AST in the list will be the lowest in the calling stack and closest to our function.
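
To make the "first element / second element" test concrete, here is a minimal sketch on quoted calls, using only base R (quote() does not evaluate the pipe, so magrittr is not needed here). The helper pipe_source() is a hypothetical name introduced for illustration. It also shows that a chain parses left to right, so the second element of the outermost pipe call is itself a pipe call:

```r
# A call behaves like a list: element 1 is the function, the rest are arguments.
single <- quote(a %>% b())
is_pipe_call <- identical(single[[1]], as.name("%>%"))  # TRUE for a pipe call
lhs <- single[[2]]                                      # the symbol a

# %>% is left-associative, so this parses as (a %>% f()) %>% g().
chained <- quote(a %>% f() %>% g())
chained_lhs <- chained[[2]]                             # the call a %>% f()

# Hypothetical helper: keep following the left-hand side until it is
# no longer a pipe call, which yields the expression that started the chain.
pipe_source <- function(cl) {
  while (is.call(cl) && identical(cl[[1]], as.name("%>%"))) {
    cl <- cl[[2]]
  }
  cl
}

print(pipe_source(chained))   # a
```

The function below applies the same first-element test to every call on the stack.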

x_expression2 <- function(x) {
  sc <- sys.calls()
  ASTs <- purrr::map( as.list(sc), getAST ) %>%
    purrr::keep( ~identical(.[[1]], quote(`%>%`)) )  # Match first element to %>%

  if( length(ASTs) == 0 ) return( enexpr(x) )        # Not in a pipe
  dplyr::last( ASTs )[[2]]    # Second element is the left-hand side
}

(Minor note: I used enexpr() instead of enquo() to ensure consistent behavior of the function in and out of the pipe. Since sys.calls() traversal returns an expression, not a quosure, we want to do the same in the default case as well.)
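
The difference between the two can be seen in a short sketch, assuming rlang is installed; as_expr() and as_quo() are hypothetical wrappers written only for this comparison:

```r
library(rlang)

# Hypothetical wrappers: one captures a bare expression, the other a quosure.
as_expr <- function(x) enexpr(x)
as_quo  <- function(x) enquo(x)

y <- 1
e <- as_expr(y + 1)   # a plain call: y + 1
q <- as_quo(y + 1)    # a quosure: the same call plus the caller's environment

is_quosure(e)                   # FALSE
is_quosure(q)                   # TRUE
identical(quo_get_expr(q), e)   # TRUE: the quosure wraps the same expression
```

Because the stack traversal yields a bare expression with no environment attached, returning the enexpr() result keeps both code paths of the same type.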

The new function is pretty robust and works inside other functions, including nested %>% pipes:

x_expression2(y)
# y

y %>% x_expression2()
# y

f <- function() {x_expression2(v)}
f()
# v

g <- function() {u <- 1; u %>% x_expression2()}
g()
# u

y %>% (function(z) {w <- 1; w %>% x_expression2()})  # Note the nested pipes
# w
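
For comparison, the same idea can be sketched without purrr or dplyr. This is an assumption-laden variant, not the method above: x_expression3() is a hypothetical name, it tests each call on the stack directly instead of building an AST, and it only recognises calls written with the %>% operator. One practical difference is that the AST-based version hands back a nested list when the left-hand side is itself a call, whereas indexing the call directly returns the call unchanged.

```r
library(magrittr)  # only needed for the %>% operator in the demo calls

# Hypothetical base-R variant of x_expression2().
x_expression3 <- function(x) {
  calls <- as.list(sys.calls())   # active calls, outermost first

  # A pipe call is a call whose function position is the symbol %>%.
  is_pipe <- function(cl) is.call(cl) && identical(cl[[1]], as.name("%>%"))
  pipes <- Filter(is_pipe, calls)

  if (length(pipes) == 0) return(substitute(x))   # not in a pipe
  pipes[[length(pipes)]][[2]]   # left-hand side of the innermost pipe call
}

y <- 1
print(x_expression3(y))                # y
print(y %>% x_expression3())           # y
print((y + 1) %>% x_expression3())     # (y + 1)
```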
