Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
383 views
in Technique[技术] by (71.8m points)

r - Vectorised Rcpp random binomial draws

This is a follow-on question from this one: Generating same random variable in Rcpp and R

I'm trying to speed up a vectorised call to rbinom of this form:

    x <- c(0.1,0.4,0.6,0.7,0.8)
    rbinom(length(x),1 ,x)

In the live code of x is a vector of variable length (but typically numbering in the millions). I have no experience with Rcpp but I was wondering could I use Rcpp to speed this up. From the linked question this Rcpp code was suggested for non-vectorised rbinom calls by @Dirk Eddelbuettel :

    cppFunction("NumericVector cpprbinom(int n, double size, double prob) { 
         return(rbinom(n, size, prob)); }")
    set.seed(42); cpprbinom(10, 1, 0.5)

....and is about twice as fast as the non Rcpp option, but can't handle my vectorised version

    cpprbinom(length(x), 1, x)

How can the Rcpp code be modified to implement this?

Thanks

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Following Dirk's response here:

Is there a way of fixing the code without using an explicit loop in the C++ code?

I don't think so. The code currently has this hard-wired: <...> so until one of us has sufficient [time] to extend this (and test it) will have to do the loop at your end.

Here's my implementation of a "vectorised" code:

library(Rcpp)
cppFunction("NumericVector cpprbinom(int n, double size, NumericVector prob) { 
    NumericVector v(n);            
    for (int i=0; i<n; i++) {v[i] = as<double>(rbinom(1, size, prob[i]));} 
    return(v); }")
r <- runif(1e6)
all.equal({set.seed(42); rbinom(length(r), 1, r)}, 
          {set.seed(42); cpprbinom(length(r), 1, r)})
#TRUE

But the problem is (again citing Dirk),

And I suggest that before expending a lot of effort on this you check whether you are likely to do better than the R function rbinom. That R function is vectorized in C code and you are unlikely to make things much faster by using Rcpp, unless you want to use the random variates in another C++ function.

And it is actually slower (x3 on my machine), so at least such naive implementation as mine won't help:

library(microbenchmark)
microbenchmark(rbinom(length(r), 1, r), cpprbinom(length(r), 1, r))

Unit: milliseconds
                       expr       min        lq      mean    median        uq       max neval
    rbinom(length(r), 1, r)  55.50856  56.09292  56.49456  56.45297  56.65897  59.42524   100
 cpprbinom(length(r), 1, r) 117.63761 153.37599 154.94164 154.29623 155.37247 225.56535   100

EDIT: according to Romain's comment below, here's an advanced version, which is faster!

cppFunction(plugins=c("cpp11"), "NumericVector cpprbinom2(int n, double size, NumericVector prob) { 
    NumericVector v = no_init(n);
    std::transform( prob.begin(), prob.end(), v.begin(), [=](double p){ return R::rbinom(size, p); }); 
    return(v);}")
r <- runif(1e6)
all.equal({set.seed(42); rbinom(length(r), 1, r)}, 
          {set.seed(42); cpprbinom(length(r), 1, r)}, 
          {set.seed(42); cpprbinom2(length(r), 1, r)})
#TRUE
microbenchmark(rbinom(length(r), 1, r), cpprbinom(length(r), 1, r), cpprbinom2(length(r), 1, r))

Unit: milliseconds
                        expr       min        lq      mean    median        uq       max neval
     rbinom(length(r), 1, r)  55.26412  56.00314  56.57814  56.28616  56.59561  60.01861   100
  cpprbinom(length(r), 1, r) 113.72513 115.94758 122.81545 117.24708 119.95134 168.47246   100
 cpprbinom2(length(r), 1, r)  36.67589  37.12182  38.95318  37.37436  37.97719  84.73516   100

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...