I'd like to fit a nonlinear regression for dimensionality reduction on a dataset that has more predictors than observations, where the predictors can also be multicollinear [edit: it is similar to a gene expression data set]. From googling, I found that a GAM with smooth terms and an L1 penalty could do the job. However, when I try to fit such a model with the R package mgcv, I immediately get the error "model has more coefficients than data".
After reading the answer to this question, I assume that I cannot fit a GAM with more predictors than observations using mgcv. Can someone point me to a package that is suitable for this task, or tell me whether I have made a mistake in my code?
Here is example code that reproduces the error. Note that my "real" dataset has p > n [edit: and all variables are numeric].
library(mgcv)
set.seed(2)
dat <- gamSim(7, n=40, scale=2) #get some example data
colnames(dat)
#"y" "x0" "x1" "x2" "x3" "f" "f0" "f1" "f2" "f3"
b <- gam(y ~ s(x0) + s(x1) + s(x2) + s(x3) + s(f) + s(f0) + s(f1) + s(f2),
         data = dat, select = TRUE)
summary(b)
#error: model has more coefficients than data
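For context on why this toy example fails even though it has only 8 predictors and 40 rows: as far as I understand mgcv, each s() term defaults to a basis dimension of k = 10, which leaves 9 coefficients per smooth after the identifiability constraint, so the model asks for 8 * 9 + 1 = 73 coefficients. The sketch below is only an illustration under that assumption. It lowers k to 4 (a value picked arbitrarily here) and adds method = "REML" (also my choice, not part of the original call) so that the coefficient count drops below n. It does not address the real p > n case, where the number of smooths alone already exceeds the number of observations.

```r
library(mgcv)

set.seed(2)
dat <- gamSim(7, n = 40, scale = 2)  # same example data as above

# Assumption: with k = 4 each smooth keeps 3 coefficients after the
# sum-to-zero constraint, so the model has 8 * 3 + 1 = 25 coefficients,
# which is fewer than the 40 rows in dat.
b_small <- gam(y ~ s(x0, k = 4) + s(x1, k = 4) + s(x2, k = 4) + s(x3, k = 4) +
                 s(f, k = 4) + s(f0, k = 4) + s(f1, k = 4) + s(f2, k = 4),
               data = dat, select = TRUE, method = "REML")

# Compare the number of coefficients with the number of observations
n_coef <- length(coef(b_small))
n_obs  <- nrow(dat)
```

Comparing n_coef with n_obs shows whether the limit from the error message is respected; a small k also restricts how wiggly each smooth can be, so this is a trade-off rather than a fix.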
question from:
https://stackoverflow.com/questions/66064577/error-nonlinear-regression-model-in-r-that-has-more-predictors-than-observation