Most of the questions about merging data.frame in lists on SO don't quite relate to what I'm trying to get across here, but feel free to prove me wrong.
I have a list of data.frames. I would like to "rbind" rows into another data.frame by row. In essence, all first rows form one data.frame, second rows second data.frame and so on.
Result would be a list of the same length as the number of rows in my original data.frame(s). So far, the data.frames are identical in dimensions.
Here's some data to play around with.
sample.list <- list(data.frame(x = sample(1:100, 10), y = sample(1:100, 10), capt = sample(0:1, 10, replace = TRUE)),
data.frame(x = sample(1:100, 10), y = sample(1:100, 10), capt = sample(0:1, 10, replace = TRUE)),
data.frame(x = sample(1:100, 10), y = sample(1:100, 10), capt = sample(0:1, 10, replace = TRUE)),
data.frame(x = sample(1:100, 10), y = sample(1:100, 10), capt = sample(0:1, 10, replace = TRUE)),
data.frame(x = sample(1:100, 10), y = sample(1:100, 10), capt = sample(0:1, 10, replace = TRUE)),
data.frame(x = sample(1:100, 10), y = sample(1:100, 10), capt = sample(0:1, 10, replace = TRUE)),
data.frame(x = sample(1:100, 10), y = sample(1:100, 10), capt = sample(0:1, 10, replace = TRUE)))
Here's what I've come up with with the good ol' for loop.
#solution 1
my.list <- vector("list", nrow(sample.list[[1]]))
for (i in 1:nrow(sample.list[[1]])) {
for (j in 1:length(sample.list)) {
my.list[[i]] <- rbind(my.list[[i]], sample.list[[j]][i, ])
}
}
#solution 2 (so far my favorite)
sample.list2 <- do.call("rbind", sample.list)
my.list2 <- vector("list", nrow(sample.list[[1]]))
for (i in 1:nrow(sample.list[[1]])) {
my.list2[[i]] <- sample.list2[seq(from = i, to = nrow(sample.list2), by = nrow(sample.list[[1]])), ]
}
Can this be improved using vectorization without much brainhurt? Correct answer will contain a snippet of code, of course. "Yes" as an answer doesn't count.
EDIT
#solution 3 (a variant of solution 2 above)
ind <- rep(1:nrow(sample.list[[1]]), times = length(sample.list))
my.list3 <- split(x = sample.list2, f = ind)
BENCHMARKING
I've made my list larger with more rows per data.frame. I've benchmarked the results which are as follows:
#solution 1
system.time(for (i in 1:nrow(sample.list[[1]])) {
for (j in 1:length(sample.list)) {
my.list[[i]] <- rbind(my.list[[i]], sample.list[[j]][i, ])
}
})
user system elapsed
80.989 0.004 81.210
# solution 2
system.time(for (i in 1:nrow(sample.list[[1]])) {
my.list2[[i]] <- sample.list2[seq(from = i, to = nrow(sample.list2), by = nrow(sample.list[[1]])), ]
})
user system elapsed
0.957 0.160 1.126
# solution 3
system.time(split(x = sample.list2, f = ind))
user system elapsed
1.104 0.204 1.332
# solution Gabor
system.time(lapply(1:nr, bind.ith.rows))
user system elapsed
0.484 0.000 0.485
# solution ncray
system.time(alply(do.call("cbind",sample.list), 1,
.fun=matrix, ncol=ncol(sample.list[[1]]), byrow=TRUE,
dimnames=list(1:length(sample.list),names(sample.list[[1]]))))
user system elapsed
11.296 0.016 11.365
See Question&Answers more detail:
os