Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
966 views
in Technique[技术] by (71.8m points)

r - Calculating ratios by group with dplyr

Using the following dataframe I would like to group the data by replicate and group and then calculate a ratio of treatment values to control values.

structure(list(group = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L), .Label = c("case", "controls"), class = "factor"), treatment = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "EPA", class = "factor"), 
    replicate = structure(c(2L, 4L, 3L, 1L, 2L, 4L, 3L, 1L), .Label = c("four", 
    "one", "three", "two"), class = "factor"), fatty_acid_family = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "saturated", class = "factor"), 
    fatty_acid = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "14:0", class = "factor"), 
    quant = c(6.16, 6.415, 4.02, 4.05, 4.62, 4.435, 3.755, 3.755
    )), .Names = c("group", "treatment", "replicate", "fatty_acid_family", 
"fatty_acid", "quant"), class = "data.frame", row.names = c(NA, 
-8L))

I have tried using dplyr as follows:

group_by(dataIn, replicate, group) %>% transmute(ratio = quant[group=="case"]/quant[group=="controls"])

but this results in Error: incompatible size (%d), expecting %d (the group size) or 1

Initially I thought this might be because I was trying to create 4 ratios from a df 8 rows deep and so I thought summarise might be the answer (collapsing each group to one ratio) but that doesn't work either (my understanding is a shortcoming).

group_by(dataIn, replicate, group) %>% summarise(ratio = quant[group=="case"]/quant[group=="controls"])

  replicate    group ratio
1      four     case    NA
2      four controls    NA
3       one     case    NA
4       one controls    NA
5     three     case    NA
6     three controls    NA
7       two     case    NA
8       two controls    NA

I would appreciate some advice on where I'm going wrong or even if this can be done with dplyr.

Thanks.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

You can try:

group_by(dataIn, replicate) %>% 
    summarise(ratio = quant[group=="case"]/quant[group=="controls"])
#Source: local data frame [4 x 2]
#
#  replicate    ratio
#1      four 1.078562
#2       one 1.333333
#3     three 1.070573
#4       two 1.446449

Because you grouped by replicate and group, you could not access data from different groups at the same time.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...