A loop at the R level is not vectorized. An R loop will be calling the same R code for each element of a vector, which will be inefficient. Vectorized functions usually refer to those that take a vector and operate on the entire vector in an efficient way. Ultimately this will involve some for of loop, but as that loop is being performed in a low-level language such as C it can be highly efficient and tailored to the particular task.
Consider this silly function to add pairwise the elements of two vectors
sillyplus <- function(x, y) {
out <- numeric(length = length(x))
for(i in seq_along(x)) {
out[i] <- x[i] + y[i]
}
out
}
It gives the right result
R> sillyplus(1:10, 1:10)
[1] 2 4 6 8 10 12 14 16 18 20
and is vectorised in the sense that it can operate on entire vectors at once, but it is not vectorised in the sense I describe above because it is exceptionally inefficient. +
is vectorised at the C level in R so we really only need 1:10 + 1:10
, not an explicit loop in R.
The usual way to write a vectorised function is to use existing R functions that are already vectorised. If you want to start from scratch and the thing you want to do with the function doesn't exist as a vectorised function in R (odd, but possible) then you will need to get your hands dirty and write the guts of the function in C and prepare a little wrapper in R to call the C function you wrote with the vector of data you want it to work on. There are ways with functions like Vectorize()
to fake vectorisation for R functions that are not vectorised.
C is not the only option here, FORTRAN is a possibility as is C++ and, thanks to Dirk Eddelbuettel & Romain Francois, the latter is much easier to do now with the rcpp package.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…