Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Delete columns where all values are 0

I have a numeric matrix with 15000 columns. I want to completely remove the columns where all values are 0.

     col1     col2     col3     col4
row1  1        0        0        1
row2  3.4      0        0        2.4
row3  0.56     0        0        0
row4  0        0        0        0
 

Here I want to delete columns col2 and col3, and keep the rest. How can I do it with R? Thanks

1 Reply

A quicker way to do the same thing (3-5x faster) would be:

M[, colSums(M^2) != 0]
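
As a minimal sketch, here is that expression applied to the matrix from the question. The `drop = FALSE` argument and the names `keep` and `M_clean` are additions for illustration, not part of the original answer; `drop = FALSE` is assumed here so that the result stays a matrix even if only one column survives.

```r
# Rebuild the example matrix from the question
M <- matrix(c(1,    0, 0, 1,
              3.4,  0, 0, 2.4,
              0.56, 0, 0, 0,
              0,    0, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("row", 1:4), paste0("col", 1:4)))

# A column of squares sums to zero only when every entry is zero
keep <- colSums(M^2) != 0

# drop = FALSE (an addition) keeps the result a matrix
M_clean <- M[, keep, drop = FALSE]
print(M_clean)
```

Only col1 and col4 remain, and all four rows are kept, including row4, which is zero in every surviving column.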

EDIT: Added timing details for the various approaches suggested here. The approach suggested by @Dwin, M[, colSums(abs(M)) != 0], seems to be the fastest, especially when the matrix is large. I will update the benchmark report if other solutions are suggested.

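# Test matrix: 1000 rows x 15000 columns, alternating random and all-zero columns
m <- cbind(rnorm(1000), 0)
M <- matrix(rep(m, 7500), ncol = 15000)

# Candidate approaches, one per contributor
f_joran   = function(M) M[, !apply(M == 0, 2, all)]
f_ramnath = function(M) M[, colSums(M^2) != 0]
f_ben     = function(M) M[, colSums(M == 0) != nrow(M)]  # zeros are counted per column, so compare against nrow(M)
f_dwin    = function(M) M[, colSums(abs(M)) != 0]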

library(rbenchmark)
benchmark(f_joran(M), f_ramnath(M), f_ben(M), f_dwin(M), 
   columns = c('test', 'elapsed', 'relative'), 
   order = 'relative', replications = 10)


          test elapsed relative
4    f_dwin(M)  11.699 1.000000
2 f_ramnath(M)  12.056 1.030515
1   f_joran(M)  26.453 2.261133
3     f_ben(M)  28.981 2.477220
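
Timings only matter if the candidates return the same result, so a quick sanity check on a small matrix may be worthwhile. The sketch below uses hypothetical helper names (`by_apply`, `by_squares`, `by_abs`, `by_zero_count`) and a made-up test matrix `X`; none of these appear in the original answer. The zero-count variant compares against `nrow(M)`, since `colSums(M == 0)` counts zeros down the rows of each column. The last step illustrates why a plain `colSums(M) != 0` is not a safe shortcut when negative values are possible.

```r
# Hypothetical helpers mirroring the approaches compared above
by_apply      <- function(M) M[, !apply(M == 0, 2, all), drop = FALSE]
by_squares    <- function(M) M[, colSums(M^2) != 0, drop = FALSE]
by_abs        <- function(M) M[, colSums(abs(M)) != 0, drop = FALSE]
by_zero_count <- function(M) M[, colSums(M == 0) != nrow(M), drop = FALSE]

# Made-up test matrix: column "b" is all zero,
# column "c" sums to zero but is not all zero
X <- cbind(a = c(1, 2, 0), b = c(0, 0, 0), c = c(-1, 1, 0))

# Only the all-zero column "b" should be removed
expected <- X[, c("a", "c"), drop = FALSE]
results <- list(by_apply(X), by_squares(X), by_abs(X), by_zero_count(X))
all_agree <- all(vapply(results, identical, logical(1), expected))

# Plain colSums() would wrongly drop column "c", because -1 + 1 + 0 == 0
naive <- X[, colSums(X) != 0, drop = FALSE]

print(all_agree)
print(colnames(naive))
```

Taking squares or absolute values before summing is what prevents positive and negative entries from cancelling out.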
