Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
447 views
in Technique[技术] by (71.8m points)

r - Apply a function to a subset of data.table columns, by column-indices instead of name

I'm trying to apply a function to a group of columns in a large data.table without referring to each one individually.

a <- data.table(
  a=as.character(rnorm(5)),
  b=as.character(rnorm(5)),
  c=as.character(rnorm(5)),
  d=as.character(rnorm(5))
)
b <- c('a','b','c','d')

with the MWE above, this:

a[,b=as.numeric(b),with=F]

works, but this:

a[,b[2:3]:=data.table(as.numeric(b[2:3])),with=F]

doesn't work. What is the correct way to apply the as.numeric function to just columns 2 and 3 of a without referring to them individually.

(In the actual data set there are tens of columns so it would be impractical)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The idiomatic approach is to use .SD and .SDcols

You can force the RHS to be evaluated in the parent frame by wrapping in ()

a[, (b) := lapply(.SD, as.numeric), .SDcols = b]

For columns 2:3

a[, 2:3 := lapply(.SD, as.numeric), .SDcols = 2:3]

or

mysubset <- 2:3
a[, (mysubset) := lapply(.SD, as.numeric), .SDcols = mysubset]

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...