I have a data table in R:
library(data.table)
set.seed(1234)
DT <- data.table(x=rep(c(1,2,3),each=4), y=c("A","B"), v=sample(1:100,12))
DT
x y v
[1,] 1 A 12
[2,] 1 B 62
[3,] 1 A 60
[4,] 1 B 61
[5,] 2 A 83
[6,] 2 B 97
[7,] 2 A 1
[8,] 2 B 22
[9,] 3 A 99
[10,] 3 B 47
[11,] 3 A 63
[12,] 3 B 49
I can easily sum the variable v by the groups in the data.table:
out <- DT[,list(SUM=sum(v)),by=list(x,y)]
out
x y SUM
[1,] 1 A 72
[2,] 1 B 123
[3,] 2 A 84
[4,] 2 B 119
[5,] 3 A 162
[6,] 3 B 96
However, I would like to have the groups (y) as columns, rather than rows. I can accomplish this using reshape
:
out <- reshape(out,direction='wide',idvar='x', timevar='y')
out
x SUM.A SUM.B
[1,] 1 72 123
[2,] 2 84 119
[3,] 3 162 96
Is there a more efficient way to reshape the data after aggregating it? Is there any way to combine these operations into one step, using the data.table operations?
Question&Answers:
os