r - Why does summarize or mutate not work with group_by when I load `plyr` after `dplyr`?

posted Oct 17, 2021 in Technique[技术] by 深蓝 (71.8m points)

Note: The title of this question has been edited to make it the canonical question for issues when plyr functions mask their dplyr counterparts. The rest of the question remains unchanged.

Suppose I have the following data:

dfx <- data.frame(
  group = c(rep('A', 8), rep('B', 15), rep('C', 6)),
  sex = sample(c("M", "F"), size = 29, replace = TRUE),
  age = runif(n = 29, min = 18, max = 54)
)

With the good old plyr I can create a little table summarizing my data with the following code:

require(plyr)
ddply(dfx, .(group, sex), summarize,
      mean = round(mean(age), 2),
      sd = round(sd(age), 2))

The output looks like this:

  group sex  mean    sd
1     A   F 49.68  5.68
2     A   M 32.21  6.27
3     B   F 31.87  9.80
4     B   M 37.54  9.73
5     C   F 40.61 15.21
6     C   M 36.33 11.33

I'm trying to move my code to dplyr and the %>% operator. My code takes the data frame, groups it by group and sex, and then summarises it. That is:

dfx %>% group_by(group, sex) %>% 
  summarise(mean = round(mean(age), 2), sd = round(sd(age), 2))

But my output is:

  mean   sd
1 35.56 9.92

What am I doing wrong?

1 Reply

深蓝 · Answer 1 · 2021-10-16T21:10:12+0000

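The problem here is that you are loading dplyr first and then plyr, so plyr's summarise is masking dplyr's summarise. plyr's version knows nothing about the grouping created by group_by, so it summarises the whole data frame into a single row. When you load the packages in that order you get this warning: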

require(plyr)
Loading required package: plyr
------------------------------------------------------------------------------------------
You have loaded plyr after dplyr - this is likely to cause problems.
If you need functions from both plyr and dplyr, please load plyr first, then dplyr:
library(plyr); library(dplyr)
------------------------------------------------------------------------------------------

Attaching package: ‘plyr’

The following objects are masked from ‘package:dplyr’:

    arrange, desc, failwith, id, mutate, summarise, summarize

So for your code to work, either detach plyr with detach(package:plyr), or restart R and load plyr first and then dplyr (or load only dplyr):

library(dplyr)
dfx %>% group_by(group, sex) %>% 
  summarise(mean = round(mean(age), 2), sd = round(sd(age), 2))
Source: local data frame [6 x 4]
Groups: group

  group sex  mean    sd
1     A   F 41.51  8.24
2     A   M 32.23 11.85
3     B   F 38.79 11.93
4     B   M 31.00  7.92
5     C   F 24.97  7.46
6     C   M 36.17  9.11
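
If you are not sure whether masking is the cause, base R can tell you which package a bare summarise call resolves to. The following is a minimal sketch, assuming both plyr and dplyr are installed; the variable names owner_before and owner_after are only for illustration.

```r
library(dplyr)
library(plyr)   # attached second on purpose, to reproduce the masking

# Search-path entries that define `summarise`, in lookup order.
# The first entry is the one a bare summarise() call will use.
find("summarise")

# Namespace of the function that a bare call currently resolves to.
owner_before <- environmentName(environment(summarise))

# Detach plyr if it is attached, then check again.
if ("package:plyr" %in% search()) {
  detach("package:plyr", character.only = TRUE)
}
owner_after <- environmentName(environment(summarise))

print(c(before = owner_before, after = owner_after))
```

Calling conflicts(detail = TRUE) gives a broader view of every masked name on the search path, not just summarise.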

Or you can explicitly call dplyr's summarise in your code, so the right function will be called no matter how you load the packages:

dfx %>% group_by(group, sex) %>% 
  dplyr::summarise(mean = round(mean(age), 2), sd = round(sd(age), 2))
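
To see the difference between the two functions side by side without attaching plyr at all, you can call each one through its namespace. This is a sketch under a few assumptions: plyr is installed, dplyr is version 1.0.0 or later (for the .groups argument), and sex is filled deterministically with rep() instead of sample() so that every group/sex combination is guaranteed to exist.

```r
library(dplyr)

# Same shape as the data in the question, but with a deterministic `sex`
# column (an assumption made here so all six combinations are present).
dfx <- data.frame(
  group = c(rep("A", 8), rep("B", 15), rep("C", 6)),
  sex   = rep(c("M", "F"), length.out = 29),
  age   = runif(n = 29, min = 18, max = 54)
)

grouped <- dfx %>% group_by(group, sex)

# plyr's version ignores the grouping and collapses everything to one row.
res_plyr <- grouped %>%
  plyr::summarise(mean = round(mean(age), 2), sd = round(sd(age), 2))

# dplyr's version returns one row per group/sex combination.
res_dplyr <- grouped %>%
  dplyr::summarise(mean = round(mean(age), 2), sd = round(sd(age), 2),
                   .groups = "drop")

nrow(res_plyr)
nrow(res_dplyr)
```

The one-row result from plyr::summarise has the same shape as the unexpected output in the question, which is a quick way to confirm that masking is what you are seeing.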
