Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
339 views
in Technique[技术] by (71.8m points)

r - Efficient row-wise operations on a data.table

I need to find the row-wise minimum of many (+60) relatively large data.frame (~ 250,000 x 3) (or I can equivalently work on an xts).

set.seed(1000)
my.df <- sample(1:5, 250000*3, replace=TRUE)
dim(my.df) <- c(250000,3)
my.df <- as.data.frame(my.df)
names(my.df) <- c("A", "B", "C")

The data frame my.df looks like this

> head(my.df)

  A B C
1 2 5 2
2 4 5 5
3 1 5 3
4 4 4 3
5 3 5 5
6 1 5 3

I tried

require(data.table)
my.dt <- as.data.table(my.df)

my.dt[, row.min:=0]  # without this: "Attempt to add new column(s) and set subset of rows at the same time"
system.time(
  for (i in 1:dim(my.dt)[1]) my.dt[i, row.min:= min(A, B, C)]
)

On my system this takes ~400 seconds. It works, but I am not confident it is the best way to use data.table. Am I using data.table correctly? Is there a more efficient way to do simple row-wise opertations?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Or, just pmin.

my.dt <- as.data.table(my.df)
system.time(my.dt[,row.min:=pmin(A,B,C)])
# user  system elapsed 
# 0.02    0.00    0.01 
head(my.dt)
#      A B C row.min
# [1,] 2 5 2       2
# [2,] 4 5 5       4
# [3,] 1 5 3       1
# [4,] 4 4 3       3
# [5,] 3 5 5       3
# [6,] 1 5 3       1

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...