Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Why are my dplyr group_by & summarize not working properly? (name-collision with plyr)

I have a data frame that looks like this:

#df
ID  DRUG  FED  AUC0t  Tmax  Cmax
1   1     0    100    5     20
2   1     1    200    6     25
3   0     1    NA     2     30
4   0     0    150    6     65

And so on. I want to compute summary statistics for AUC, Tmax and Cmax, grouped by drug (DRUG) and fed status (FED). I am using dplyr. For example, for AUC:

CI90lo <- function(x) quantile(x, probs=0.05, na.rm=TRUE)
CI90hi <- function(x) quantile(x, probs=0.95, na.rm=TRUE)  

summary <- df %>%
  group_by(DRUG, FED) %>%
  summarize(mean = mean(AUC0t, na.rm = TRUE),
            low  = CI90lo(AUC0t),
            high = CI90hi(AUC0t),
            min  = min(AUC0t, na.rm = TRUE),
            max  = max(AUC0t, na.rm = TRUE),
            sd   = sd(AUC0t, na.rm = TRUE))

However, the output is not grouped by DRUG and FED. It contains a single row with the statistics computed over all the data, not broken down by DRUG and FED.

Any idea why, and how can I make it do the right thing?


1 Reply


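I believe you loaded plyr after dplyr. Both packages export summarize(), and the one attached last masks the other, so your call resolves to plyr::summarize(), which ignores dplyr's grouping. That is why you are getting an overall summary instead of a grouped summary.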

This is what happens with plyr loaded last.

library(dplyr)
library(plyr)
df %>%
  group_by(DRUG, FED) %>%
  summarize(mean = mean(AUC0t, na.rm = TRUE),
            low  = CI90lo(AUC0t),
            high = CI90hi(AUC0t),
            min  = min(AUC0t, na.rm = TRUE),
            max  = max(AUC0t, na.rm = TRUE),
            sd   = sd(AUC0t, na.rm = TRUE))

  mean low high min max sd
1  150 105  195 100 200 50
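
If you are not sure which package's summarize() your session is picking up, you can ask R directly. The following is a minimal sketch, assuming only dplyr is attached; the variable name owner is just for illustration. If plyr had been attached after dplyr, the first check would report plyr instead.

```r
# Sketch: check which package owns the summarize() that R currently finds.
library(dplyr)

# environment() of a package function is that package's namespace,
# and environmentName() turns the namespace into the package name.
owner <- environmentName(environment(summarize))
print(owner)

# conflicts() lists names defined in more than one attached package;
# summarize/summarise show up here once both plyr and dplyr are attached.
print(c("summarize", "summarise") %in% conflicts())
```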

Now detach plyr and try again, and you get the grouped summary.

detach(package:plyr)
df %>%
  group_by(DRUG, FED) %>%
  summarize(mean = mean(AUC0t, na.rm = TRUE),
            low  = CI90lo(AUC0t),
            high = CI90hi(AUC0t),
            min  = min(AUC0t, na.rm = TRUE),
            max  = max(AUC0t, na.rm = TRUE),
            sd   = sd(AUC0t, na.rm = TRUE))

Source: local data frame [4 x 8]
Groups: DRUG

  DRUG FED mean low high min max  sd
1    0   0  150 150  150 150 150 NaN
2    0   1  NaN  NA   NA  NA  NA NaN
3    1   0  100 100  100 100 100 NaN
4    1   1  200 200  200 200 200 NaN
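
If you need both packages in the same session, another option is to namespace-qualify the calls so that the load order no longer matters. This is a rough sketch rather than the only way to do it: the data frame is rebuilt from the four rows shown in the question, the statistics are cut down to a few columns, the result name auc_summary is made up for the example, and the .groups argument assumes dplyr 1.0.0 or later.

```r
# Sketch: call dplyr's verbs explicitly so plyr cannot mask them.
library(dplyr)

# The four example rows from the question.
df <- data.frame(
  ID    = 1:4,
  DRUG  = c(1, 1, 0, 0),
  FED   = c(0, 1, 1, 0),
  AUC0t = c(100, 200, NA, 150),
  Tmax  = c(5, 6, 2, 6),
  Cmax  = c(20, 25, 30, 65)
)

# dplyr:: forces the dplyr versions regardless of what was attached last.
auc_summary <- df %>%
  dplyr::group_by(DRUG, FED) %>%
  dplyr::summarise(
    mean  = mean(AUC0t, na.rm = TRUE),
    sd    = sd(AUC0t, na.rm = TRUE),
    n_obs = sum(!is.na(AUC0t)),   # non-missing AUC0t values per group
    .groups = "drop"              # assumes dplyr >= 1.0.0
  )

print(auc_summary)
```

With this data the result has one row per DRUG/FED combination, so four rows. The sd column is NA throughout because no group has more than one non-missing AUC0t value.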
