Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

R - Subtract pairs of columns based on matching column

I'll apologise in advance - I know this has likely been answered elsewhere, but I don't seem to be able to find the answer I need, and can't manage to adapt other code I have found to my needs.

I have a data frame:

FILE | TECHNIQUE | COUNT
------------------------
A    | ONE       | 10
A    | TWO       | 25
B    | ONE       |  5
B    | TWO       | 30
C    | ONE       | 30
C    | TWO       | 50
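
For anyone wanting to reproduce this, the table above can be built roughly as follows. This is only a sketch: the object name DF is an assumption (it is the name the reply below uses), and COUNT is assumed to be numeric.

```r
# Reproducible copy of the table above.
# Assumption: the object name DF is not given in the question; it is
# borrowed from the code in the reply.
DF <- data.frame(
  FILE      = c("A", "A", "B", "B", "C", "C"),
  TECHNIQUE = c("ONE", "TWO", "ONE", "TWO", "ONE", "TWO"),
  COUNT     = c(10, 25, 5, 30, 30, 50)
)
```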

I would like to produce a data frame holding, for each FILE, the difference between the COUNT values for TWO and ONE (TWO minus ONE), with one row per FILE, i.e.

FILE | DIFFERENCE
-----------------
A    |   15
B    |   25
C    |   20

I'm convinced I should be able to do this fairly easily with base R or plyr, but am a bit stuck. Could anyone suggest a good way to do this, and perhaps good tutorials on plyr that might help me with similar problems in the future?

Thanks

1 Reply

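Using aggregate in base R (like every approach below, this assumes the rows for each FILE are ordered ONE then TWO, so that diff returns TWO minus ONE):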

> aggregate(.~FILE, data= DF[, -2], FUN=diff)
  FILE COUNT
1    A    15
2    B    25
3    C    20

Using ddply in plyr:

> library(plyr)
> ddply(DF[,-2], .(FILE), summarize, DIFFERENCE=diff(COUNT))
  FILE DIFFERENCE
1    A         15
2    B         25
3    C         20

Using data.table:

> library(data.table)
> DT <- data.table(DF)
> DT[, diff(COUNT), by=FILE]
   FILE V1
1:    A 15
2:    B 25
3:    C 20

Using by:

> with(DF, by(COUNT, FILE, diff))
FILE: A
[1] 15
----------------------------------------------------------------------------- 
FILE: B
[1] 25
----------------------------------------------------------------------------- 
FILE: C
[1] 20

Using tapply:

> tapply(DF$COUNT, DF$FILE, diff)
 A  B  C 
15 25 20 
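
tapply returns a named vector rather than the data frame the question asks for. A minimal sketch of one way to convert it, assuming DF holds the question's data (it is rebuilt here so the snippet runs on its own); the names d and result are purely illustrative:

```r
# Assumption: DF mirrors the table in the question.
DF <- data.frame(
  FILE      = c("A", "A", "B", "B", "C", "C"),
  TECHNIQUE = c("ONE", "TWO", "ONE", "TWO", "ONE", "TWO"),
  COUNT     = c(10, 25, 5, 30, 30, 50)
)

# One difference per FILE, as a named one-dimensional array.
d <- tapply(DF$COUNT, DF$FILE, diff)

# Move the names into a FILE column and the values into DIFFERENCE.
result <- data.frame(FILE = names(d), DIFFERENCE = as.vector(d))
result
```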

Using summaryBy from the doBy package:

> library(doBy)
> summaryBy(COUNT~FILE, FUN=diff, data=DF)
  FILE COUNT.diff
1    A         15
2    B         25
3    C         20

Update: as a percentage (ONE as a percentage of TWO):

> aggregate(.~FILE, data= DF[, -2], function(x) (x[1]/x[2])*100)
  FILE    COUNT
1    A 40.00000
2    B 16.66667
3    C 60.00000
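
All of the approaches above rely on the ONE row coming before the TWO row within each FILE. If that ordering cannot be guaranteed, one possible alternative is to match on the TECHNIQUE labels explicitly. The sketch below uses base R's merge; the row order is deliberately shuffled, and the names one, two, m and result are illustrative only:

```r
# Assumption: same data as in the question, but with the rows shuffled
# to show that the result does not depend on row order.
DF <- data.frame(
  FILE      = c("B", "A", "C", "A", "C", "B"),
  TECHNIQUE = c("TWO", "ONE", "TWO", "TWO", "ONE", "ONE"),
  COUNT     = c(30, 10, 50, 25, 30, 5)
)

# Split by technique, keeping only the key and the value.
one <- DF[DF$TECHNIQUE == "ONE", c("FILE", "COUNT")]
two <- DF[DF$TECHNIQUE == "TWO", c("FILE", "COUNT")]

# Join on FILE; the suffixes give columns COUNT.ONE and COUNT.TWO.
m <- merge(one, two, by = "FILE", suffixes = c(".ONE", ".TWO"))

# TWO minus ONE for each FILE.
result <- data.frame(FILE = m$FILE, DIFFERENCE = m$COUNT.TWO - m$COUNT.ONE)
result
#   FILE DIFFERENCE
# 1    A         15
# 2    B         25
# 3    C         20
```

Because merge only keeps FILE values present in both subsets, a FILE that lacks either technique is dropped rather than producing a misleading difference.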
