r - mgcv: How to set number and / or locations of knots for splines

Question

Welcome To Ask or Share your Answers For Others

r - mgcv: How to set number and / or locations of knots for splines

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

r - mgcv: How to set number and / or locations of knots for splines

I want to use function gam in mgcv packages:

 x <- seq(0,60, len =600)
 y <- seq(0,1, len=600) 
 prova <- gam(y ~ s(x, bs='cr')

can I set the number of knots in s()? and then can I know where are the knots that the spline used? Thanks!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-23T17:46:36+0000

While setting k is the correct way to go, fx = TRUE is definitely not right: it will force using pure regression spline without penalization.

locations of knots

For penalized regression spline, the exact locations are not important, as long as:

k is adequately big;
the spread of knots has good, reasonable coverage.

By default:

natural cubic regression spline bs = 'cr' places knots by quantile;
B-splines family (bs = 'bs', bs = 'ps', bs = 'ad') place knots evenly.

Compare the following:

library(mgcv)

## toy data
set.seed(0); x <- sort(rnorm(400, 0, pi))  ## note, my x are not uniformly sampled
set.seed(1); e <- rnorm(400, 0, 0.4)
y0 <- sin(x) + 0.2 * x + cos(abs(x))
y <- y0 + e

## fitting natural cubic spline
cr_fit <- gam(y ~ s(x, bs = 'cr', k = 20))
cr_knots <- cr_fit$smooth[[1]]$xp  ## extract knots locations

## fitting B-spline
bs_fit <- gam(y ~ s(x, bs = 'bs', k = 20))
bs_knots <- bs_fit$smooth[[1]]$knots  ## extract knots locations

## summary plot
par(mfrow = c(1,2))
plot(x, y, col= "grey", main = "natural cubic spline");
lines(x, cr_fit$linear.predictors, col = 2, lwd = 2)
abline(v = cr_knots, lty = 2)
plot(x, y, col= "grey", main = "B-spline");
lines(x, bs_fit$linear.predictors, col = 2, lwd = 2)
abline(v = bs_knots, lty = 2)

You can see the difference in knots placement.

Setting your own knots locations:

You can also provide your customized knots locations via the knots argument of gam() (yes, knots are not fed to s(), but to gam()). For example, you can do evenly spaced knots for cr:

xlim <- range(x)  ## get range of x
myfit <- gam(y ~ s(x, bs = 'cr', k =20),
         knots = list(x = seq(xlim[1], xlim[2], length = 20)))

Now you can see that:

my_knots <- myfit$smooth[[1]]$xp
plot(x, y, col= "grey", main = "my knots");
lines(x, myfit$linear.predictors, col = 2, lwd = 2)
abline(v = my_knots, lty = 2)

However, there is usually no need to set knots yourself. But if you do want to do this, you must be clear what you are doing. Also, the number of knots you provided must match k in the s().

Categories

r - mgcv: How to set number and / or locations of knots for splines

r - mgcv: How to set number and / or locations of knots for splines

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags