Subset by Chi Sq Values from GAM results in R

Question

Welcome To Ask or Share your Answers For Others

Subset by Chi Sq Values from GAM results in R

posted Oct 7, 2021 in Technique[技术] by 深蓝 (71.8m points)

Subset by Chi Sq Values from GAM results in R

I've run a for-loop in R that generates models for a binomial GAM for 200 different random data combinations (200 different set.seed values).

The for-loop and GAMs run just fine, and they store the models in the appropriate list, model[[i]], with each list element representing a model for a certain set.seed iteration.

I can run summary() on an individual list element (model[[5]], for example) and get something like this:

Approximate significance of smooth terms:
                      edf Ref.df Chi.sq  p-value    
s(Petal.Width)  1.133e+00      9  5.414 0.008701 ** 
s(Sepal.Length) 8.643e-01      9  2.338 0.067509 .  

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.361   Deviance explained = 32.7%
-REML = 83.084  Scale est. = 1         n = 160

Since I've got 200 of these elements in my model list, I was wondering if there's a quick way to go through and count how many times (out of the total 200) that Chi.sq value is equal to 0 for the Petal.Width variable. Basically, since I have the GAMs set up with bs = "cs", the number of times that Chi.sq is equal to 0 represents how often the variable-selection process removed that variable from the model.

Here's a cleaned-up version of the code I used for the for-loop to build the model:


a <- c(1:200)
model <- list()


for (i in a) {
  
#In my version there's some extra code here that subsets iris based on i 
  
  model[[i]] <- gam(Species~ s(Petal.Width, bs="cs") + 
                      s(Sepal.Length, bs="cs"),
                    data = iris,
                    family = binomial,
                    method = "REML")

}

question from:https://stackoverflow.com/questions/65926846/subset-by-chi-sq-values-from-gam-results-in-r

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-06T19:06:58+0000

I've found the tidy() function from the broom package to be helpful in this type of situation.
here's your model (I cut down to number of iterations to 10 to make it go faster,
also it was throwing warnings using mgcv::gam())

a <- c(1:10)
model <- list()

for (i in a) {
  model[[i]] <- gam(Species~ s(Petal.Width, bs="cs") + 
                      s(Sepal.Length, bs="cs"),
                    data = iris,
                    family = binomial,
                    method = "REML")
}

If you needed or wanted the list of all the models, then this second loop will extract the statistic from each model and combine them into a dataframe:

results_df <- data.frame(term=c("Petal.Width","Sepal.Length"))
for(m in model){
  results_df <- bind_cols(results_df, model=tidy(m)$statistic)
}
results_df

with output:

          term    model...2    model...3    model...4    model...5    model...6
1  Petal.Width 1.304372e-09 1.304372e-09 1.304372e-09 1.304372e-09 1.304372e-09
2 Sepal.Length 3.844738e-16 3.844738e-16 3.844738e-16 3.844738e-16 3.844738e-16
     model...7    model...8    model...9   model...10   model...11
1 1.304372e-09 1.304372e-09 1.304372e-09 1.304372e-09 1.304372e-09
2 3.844738e-16 3.844738e-16 3.844738e-16 3.844738e-16 3.844738e-16

you can rename the columns to make them nicer, etc.
also you could combine the modeling and the combining of the statistics if that's all you want to keep.

Categories

Subset by Chi Sq Values from GAM results in R

Subset by Chi Sq Values from GAM results in R

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags