I want to parallelize my code so that it uses all available cores, so I want to replace the for loop with a foreach loop. As I am a beginner in R, I could not understand how the different posts on this topic address the issue. It would be great if somebody could help me with this step by step (with a comment on each line, so that I can understand it). Below is the for loop that I want to replace with foreach:
# A function for computing the Jensen-Shannon divergence, used inside my nested for loop
JensShanDiver = function(a, b) {
  # midpoint of the two vectors
  m = 0.5 * (a + b)
  # log-ratios; terms where the vector is 0 contribute 0
  LRa = ifelse(a > 0, log2(a / m), 0)
  LRb = ifelse(b > 0, log2(b / m), 0)
  JSD = 0.5 * (sum(a * LRa) + sum(b * LRb))
  return(JSD)
}
# an empty data frame with the same dimensions as the input
output <- data.frame(matrix(NA, nrow = nrow(input), ncol = ncol(input)))
# a zero vector with the same length as each row of the input
v2 <- numeric(ncol(input))
for (j in seq_len(nrow(input))) {
  # take each row from the input
  v1 <- as.numeric(input[j, ])
  for (i in seq_along(v1)) {
    # set the i-th element of the initially defined vector to 1
    v2[i] <- 1
    # if the current value is 0 the result is 1; otherwise compute the JSD of both vectors
    output_vec <- if (v1[i] == 0) 1 else JensShanDiver(v1, v2)
    # reset the updated element to 0 again
    v2[i] <- 0
    # write the output value at position [j, i] of the output data frame
    output[j, i] <- output_vec
  }
}
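For reference, a minimal sketch of how the outer loop above might be expressed with foreach. It assumes the foreach and doParallel packages are installed; `row_jsd` is a hypothetical helper name introduced only for this illustration, the three-row `input` is a small stand-in taken from the sample rows, and the worker count of 2 is arbitrary. It is meant as a starting point, not a definitive implementation.

```r
library(foreach)
library(doParallel)

# Same divergence function as in the question
JensShanDiver <- function(a, b) {
  m <- 0.5 * (a + b)
  LRa <- ifelse(a > 0, log2(a / m), 0)
  LRb <- ifelse(b > 0, log2(b / m), 0)
  0.5 * (sum(a * LRa) + sum(b * LRb))
}

# Hypothetical helper: turns one input row into one output row
row_jsd <- function(v1) {
  vapply(seq_along(v1), function(i) {
    if (v1[i] == 0) return(1)        # same special case as the original loop
    v2 <- numeric(length(v1))        # zero vector ...
    v2[i] <- 1                       # ... with a 1 at position i
    JensShanDiver(v1, v2)
  }, numeric(1))
}

# Small stand-in for the real input (three of the sample rows)
input <- matrix(c(0,   0,   1,
                  0.5, 0.5, 0,
                  1/3, 1/3, 1/3), nrow = 3, byrow = TRUE)

# Start the workers; parallel::detectCores() could be used to size the cluster
cl <- parallel::makeCluster(2)
registerDoParallel(cl)

# Each iteration handles one row; rbind stacks the per-row results in order.
# JensShanDiver is only called from inside row_jsd, so it is exported explicitly.
output <- foreach(j = seq_len(nrow(input)), .combine = rbind,
                  .export = "JensShanDiver") %dopar% {
  row_jsd(as.numeric(input[j, ]))
}

parallel::stopCluster(cl)
output <- as.data.frame(output)
```

Only the outer loop over rows is parallelized here, because each row is independent and the inner loop is cheap; for a small input the cost of starting workers may outweigh any gain.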
A sample of the input is given below:
> dput(input)
structure(c(0, 0.5, 0.5, 1, 0.333333333333333, 0.333333333333333,
0.333333333333333, 0, 0, 1, 0, 0.5, 0.5, 0, 0.333333333333333,
0.333333333333333, 0.333333333333333, 0.5, 0.5, 0, 1, 0, 0, 0,
0.333333333333333, 0.333333333333333, 0.333333333333333, 0.5,
0.5, 0), .Dim = c(10L, 3L), .Dimnames = list(NULL, c("ranges_in_X51214",
"ranges_in_X56499", "ranges_in_X6383")))
Here is the expected output for the given input:
> dput(output)
structure(list(X1 = c(1, 0.311278124459133, 0.311278124459133,
0, 0.459147917027245, 0.459147917027245, 0.459147917027245, 1,
1, 0), X2 = c(1, 0.311278124459133, 0.311278124459133, 1, 0.459147917027245,
0.459147917027245, 0.459147917027245, 0.311278124459133, 0.311278124459133,
1), X3 = c(0, 1, 1, 1, 0.459147917027245, 0.459147917027245,
0.459147917027245, 0.311278124459133, 0.311278124459133, 1)), .Names = c("X1",
"X2", "X3"), row.names = c(NA, 10L), class = "data.frame")
Your help will be much appreciated.