In R mathematical operations are vectorized. In other words you do not need to apply the same mathematical operation on each vector element separately, you can perform it on all vector elements simultaneously.
The function
my.kernel <- function(Yt){
for (i in 1:length(Yt)) {
Yt[i] <- ifelse(abs(Yt[i]) <= 1, (35/32)*(1 - Yt[i]^2)^3, 0)}
Yt}
can be rewritten as
my.kernel.vec <- function(x) ifelse(abs(x) <= 1, (35/32)*(1 - x^2)^3, 0)
Yt <- seq(0, 10, 0.01)
h <- 1
y <- 1
all.equal(my.kernel(Yt),
my.kernel.vec(Yt))
#output
TRUE
the difference in speed is not minor:
library(microbenchmark)
microbenchmark(my.kernel(Yt),
my.kernel.vec(Yt))
Unit: microseconds
expr min lq mean median uq max neval cld
my.kernel(Yt) 1110.8 1179.2 1438.136 1311.35 1708.9 6756.4 100 b
my.kernel.vec(Yt) 54.3 66.3 104.204 70.20 74.3 3495.4 100 a
That is quite of a speed up.
Similarly
my.kernel.density.estimator <- function(y,Yt,h){
result <- 0
for(i in 1:length(Yt)){
result <- result + (1/(length(Yt)*h))*my.kernel((Yt[i]-y)/h)}
result}
can be changed to utilize R vectorized operations
my.kernel.density.estimator.vec <- function(y,Yt,h) sum((1/(length(Yt)*h))*my.kernel.vec((Yt-y)/h))
all.equal(my.kernel.density.estimator.vec(1, Yt, 1),
my.kernel.density.estimator(1, Yt, 1))
#output
TRUE
microbenchmark(my.kernel.density.estimator.vec(1, Yt, 1),
my.kernel.density.estimator(1, Yt, 1))
Unit: microseconds
expr min lq mean median uq max neval cld
my.kernel.density.estimator.vec(1, Yt, 1) 57.8 59.6 101.918 63.10 70.25 3716.4 100 a
my.kernel.density.estimator(1, Yt, 1) 2110.8 2163.6 2285.316 2231.35 2283.20 7826.7 100 b
Finally in
my.loglik.cv <- function(Yt,h){
result <- 0
for(i in 1:length(Yt)){
result <- result + log(my.kernel.density.estimator(Yt[i],Yt[-i],h))}
result}
you need to loop in order to create vectors Yt[i]
and Yt[-i]
so I left it as is.
microbenchmark(my.loglik.cv.vec(Yt, 1),
my.loglik.cv(Yt, 1), times = 10)
Unit: milliseconds
expr min lq mean median uq max neval cld
my.loglik.cv.vec(Yt, 1) 59.1957 59.6794 79.13856 90.46365 92.7877 93.4487 10 a
my.loglik.cv(Yt, 1) 2240.7176 2280.7737 2309.83982 2299.39885 2343.6714 2412.8111 10 b
Not to mention the speedup on larger vectors:
Yt <- seq(0, 10, 0.001)
microbenchmark(my.loglik.cv.vec(Yt, 1),
my.loglik.cv(Yt, 1), times = 1)
Unit: seconds
expr min lq mean median uq max neval
my.loglik.cv.vec(Yt, 1) 5.460431 5.460431 5.460431 5.460431 5.460431 5.460431 1
my.loglik.cv(Yt, 1) 230.221194 230.221194 230.221194 230.221194 230.221194 230.221194 1