Unfortunately the accepted answer is wrong when y
is a matrix of more than one row. The error is in the line
vy <- rowSums( w * y * y )
We want to multiply the columns of y
by w
, but this will multiply the rows by the elements of w
, recycled as necessary. Thus
> f(x, y[1, , drop = FALSE], xy.wt)
[1] 0.103021
is correct, because in this case the multiplication is performed element-wise, which is equivalent to column-wise multiplication here, but
> f(x, y, xy.wt)[1]
[1] 0.05463575
gives a wrong answer due to the row-wise multiplication.
We can correct the function as follows
f2 <- function( x, y, w = rep(1,length(x))) {
stopifnot(length(x) == dim(y)[2] )
w <- w / sum(w)
# Center x and y, using the weighted means
x <- x - sum(x * w)
ty <- t(y - colSums(t(y) * w))
# Compute the variance
vx <- sum(w * x * x)
vy <- colSums(w * ty * ty)
# Compute the covariance
vxy <- colSums(ty * x * w)
# Compute the correlation
vxy / sqrt(vx * vy)
}
and check the results against those produced by corr
from the boot
package:
> res1 <- f2(x, y, xy.wt)
> res2 <- sapply(1:nrow(y),
+ function(i, x, y, w) corr(cbind(x, y[i,]), w = w),
+ x = x, y = y, w = xy.wt)
> all.equal(res1, res2)
[1] TRUE
which in itself gives another way that this problem could be solved.