r - Why is vectorization faster

Question

Welcome To Ask or Share your Answers For Others

r - Why is vectorization faster

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

r - Why is vectorization faster

I've been learning R for a while now, and have come across a lot of advice to programming types like myself to vectorize operations. Being a programmer, I'm interested as to why / how it's faster. An example:

n = 10^7
# populate with random nos
v=runif(n)
system.time({vv<-v*v; m<-mean(vv)}); m
system.time({for(i in 1:length(v)) { vv[i]<-v[i]*v[i] }; m<-mean(vv)}); m

This gave

   user  system elapsed 
   0.04    0.01    0.07 
[1] 0.3332091

   user  system elapsed 
  36.68    0.02   36.69 
[1] 0.3332091

The most obvious thing to consider is that we're running native code, i.e. machine code compiled from C or C++, rather than interpreted code, as shown by the massive difference in user time between the two examples (circa 3 orders of magnitude). But is there anything else going on? For example, does R do:

Cunning native data structures, e.g. clever ways of storing sparse vectors or matrices so that we only do multiplications when we need to?
Lazy evaluation, e.g. on a matrix multiply, don't evaluate cells until as and when you need to.
Parallel processing.
Something else.

To test whether there might be some sparse vector optimization I tried doing dot products with difference vector contents

# populate with random nos
v<-runif(n)
system.time({m<-v%*%v/n}); m
# populate with runs of 1 followed by 99 0s
v <-rep(rep(c(1,rep(0,99)),n/100))
system.time({m<-v%*%v/n}); m
# populate with 0s
v <-rep(0,n)
system.time({m<-v%*%v/n}); m

However there was no significant difference in time (circa 0.09 elapsed)

(Similar question for Matlab: Why does vectorized code run faster than for loops in MATLAB?)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-23T17:53:51+0000

The most obvious thing to consider is that we're running native code, i.e. machine code compiled from C or C++, rather than interpreted code

That's most of it. The other big-ish component is that since R code is functional in its design paradigm, functions (attempt to) have no side effects, which means that in some (but perhaps not all; R does try to be efficient about this) instances calling [<- in side a for loop results in having to copy the entire object. That can get slow.

A small side note: R does have rather extensive functionality for handling sparse matrix structures efficiently, but they aren't the "default".

Categories

r - Why is vectorization faster

r - Why is vectorization faster

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags