I want to run the loop below more efficiently, because I need to apply it to millions of rows.
Sample data:
a <- data.frame(x1 = rep(c('a','b','c','d'), 5),
                x2 = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5),
                value1 = c(rep(201,4), rep(202,4), rep(203,4), rep(204,4), rep(205,4)),
                y1 = c(rep('a',4), rep('b',4), rep('c',4), rep('d',4), rep('e',4)),
                y2 = c(1,2,3,4,2,3,4,5,3,4,5,6,4,5,6,7,5,6,7,8),
                value2 = seq(101,120), stringsAsFactors = FALSE)
I wrote the loop below to find, for each row, the first row whose (x1, x2) pair equals that row's (y1, y2) pair, and then take value1 of the matching row minus value2 of the current row.
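a$diff <- NA_real_  # initialise so rows without a match stay NA
n <- nrow(a)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    # does row i's (y1, y2) pair equal row j's (x1, x2) pair?
    if (a$y1[i] == a$x1[j] && a$y2[i] == a$x2[j]) {
      a$diff[i] <- a$value1[j] - a$value2[i]
      break  # keep only the first match
    }
  }
}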
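
For reference, here is a rough sketch of what a vectorised equivalent of this nested loop might look like in base R. It pastes each pair into a single key and uses match(), which returns the position of the first match (mirroring the break) and NA where there is none. The names key_x, key_y and idx are just illustrative, and the "|" separator assumes that character never appears in x1 or y1; I have not benchmarked this on millions of rows.

```r
# Sample data from above, repeated so the snippet runs on its own
a <- data.frame(x1 = rep(c('a','b','c','d'), 5),
                x2 = c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5),
                value1 = c(rep(201,4), rep(202,4), rep(203,4), rep(204,4), rep(205,4)),
                y1 = c(rep('a',4), rep('b',4), rep('c',4), rep('d',4), rep('e',4)),
                y2 = c(1,2,3,4,2,3,4,5,3,4,5,6,4,5,6,7,5,6,7,8),
                value2 = seq(101,120), stringsAsFactors = FALSE)

# Build one composite key per row for each pair of columns
# (assumption: "|" does not occur inside x1 or y1)
key_x <- paste(a$x1, a$x2, sep = "|")
key_y <- paste(a$y1, a$y2, sep = "|")

# For each (y1, y2) key, position of the first row with the same (x1, x2) key;
# NA when no such row exists
idx <- match(key_y, key_x)

# value1 of the matched row minus value2 of the current row; NA stays NA
a$diff <- a$value1[idx] - a$value2
```

Rows whose (y1, y2) pair has no counterpart, such as ('e', 5), end up with NA in diff.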