Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Efficient way to compare data in a matrix

I have the following issue: I need to compare data in a matrix (a data.table or data.frame) using the following function:

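Q <- function(j) {
  # x is not an argument: it must exist in the global environment
  # (for the example below: x <- tablero)
  item <- x[j, 1]                      # value in row j of the first column
  Q1 <- c()
  for (i in 2:ncol(x)) {
    indices <- which(x[, i] == item)   # row of column i that holds that value
    items <- x[1:indices, i]           # column i from the top down to that row
    Q1 <- c(Q1, items)
  }
  return(Q1)
}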

For example, with data like this:

tablero <- data.frame(t1 = c(1,2,3),
                      t2 = c(3,1,2),
                      t3 = c(3,2,1))

I get this output:

Q(1) = 3 1 3 2 1
Q(2) = 3 1 2 3 2
Q(3) = 3 3

The problem is that I have a big matrix with 50,000 rows and 7 columns, and this function is too slow and uses a lot of memory. Is there a faster, more memory-efficient way to do the same thing?


1 Reply

We can use the sapply and apply functions to loop over your data. This may be faster than your function, which relies on a for loop. Because the results have different lengths, sapply cannot simplify them, so the code returns a list.

sapply(tablero[,1],
       FUN = function(x){
         unlist(apply(tablero[,-1],
                      2,
                      FUN = function(y) y[1:which(y == x)]))
       })

# [[1]]
# t21 t22 t31 t32 t33 
# 3   1   3   2   1 

# [[2]]
# t21 t22 t23 t31 t32 
# 3   1   2   3   2 
 
# [[3]]
# t2 t3 
# 3  3 
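
As a further sketch, and not a benchmarked solution, the lookup positions can be computed once per column with match() instead of calling which() for every row. The code below assumes that every value in the first column occurs in each of the other columns, as in the example data. The names m, pos and q_fast are hypothetical and only used for illustration.

```r
# Example data from the question.
tablero <- data.frame(t1 = c(1, 2, 3),
                      t2 = c(3, 1, 2),
                      t3 = c(3, 2, 1))

# Work on a plain matrix; indexing a matrix is cheaper than indexing a data frame.
m <- as.matrix(tablero)

# pos[j, k] is the first row of column k + 1 that holds the value m[j, 1].
# match() handles a whole column at once, so there is one lookup pass per
# column rather than one which() scan per row and column.
pos <- do.call(cbind,
               lapply(2:ncol(m), function(i) match(m[, 1], m[, i])))

# q_fast (hypothetical helper) builds the result for a single row j
# from the precomputed positions.
q_fast <- function(j) {
  unlist(lapply(seq_len(ncol(m) - 1),
                function(k) m[seq_len(pos[j, k]), k + 1]),
         use.names = FALSE)
}

q_fast(1)
q_fast(2)
q_fast(3)
```

On the example data this gives the same values as Q(1), Q(2) and Q(3) in the question. On memory: a single result can hold up to nrow * (ncol - 1) values, so keeping the results for all 50,000 rows at once may not fit in memory whichever loop is used. Calling q_fast(j) only when a row is needed, or reducing each result right away, is likely the safer pattern.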
