Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
640 views
in Technique[技术] by (71.8m points)

r - Mutating column in `dplyr` using `rowSums`

Recently I stumbled uppon a strange behaviour of dplyr and I would be happy if somebody would provide some insights.

Assuming I have a data of which com columns contain some numerical values. In an easy scenario I would like to compute rowSums. Although there are many ways to do it, here are two examples:

df <- data.frame(matrix(rnorm(20), 10, 2),
                 ids = paste("i", 1:20, sep = ""),
                 stringsAsFactors = FALSE)

# works
dplyr::select(df, - ids) %>% {rowSums(.)}

# does not work
# Error: invalid argument to unary operator
df %>%
  dplyr::mutate(blubb = dplyr::select(df, - ids) %>% {rowSums(.)})

# does not work
# Error: invalid argument to unary operator
df %>%
  dplyr::mutate(blubb = dplyr::select(., - ids) %>% {rowSums(.)})

# workaround:
tmp <- dplyr::select(df, - ids) %>% {rowSums(.)}
df %>%
  dplyr::mutate(blubb = tmp)

# works
rowSums(dplyr::select(df, - ids))

# does not work
# Error: invalid argument to unary operator
df %>%
  dplyr::mutate(blubb = rowSums(dplyr::select(df, - ids)))

# workaround
tmp <- rowSums(dplyr::select(df, - ids))
df %>%
  dplyr::mutate(blubb = tmp)

First, I don't really understand what is causing the error and second I would like to know how to actually achieve a tidy computation of some (viable) columns in a tidy way.

edit

The question mutate and rowSums exclude columns , although related, focuses on using rowSums for computation. Here I'm eager to understand why the upper examples do not work. It is not so much about how to solve (see the workarounds) but to understand what happens when the naive approach is applied.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The examples do not work because you are nesting select in mutate and using bare variable names. In this case, select is trying to do something like

> -df$ids
Error in -df$ids : invalid argument to unary operator

which fails because you can't negate a character string (i.e. -"i1" or -"i2" makes no sense). Either of the formulations below works:

df %>% mutate(blubb = rowSums(select_(., "X1", "X2")))
df %>% mutate(blubb = rowSums(select(., -3)))

or

df %>% mutate(blubb = rowSums(select_(., "-ids")))

as suggested by @Haboryme.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...