Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Error in eval(predvars, data, env) : object ' ' not found in R pls()

I've seen this question come up a lot but have yet to find a satisfactory solution, particularly for my case.

I am running partial least squares regression in R using plsr() from the pls package, and would then like to calculate the root mean square error of prediction by calling RMSEP() with newdata on the fitted model. This throws an error, and I believe it is specifically because I am writing the model formula as follows:

plsr( Y ~ X[whatever , whatever ] ...

where I need to index specific parts of dataframe$X. Here is an example:

library(pls)
library(caTools)  # provides sample.split()

data(gasoline)

#Split dataframe between training and testing data
set.seed(123)
split <- sample.split(gasoline$octane, SplitRatio = 0.70)

gasoline$train <- split

gas.fit <- plsr(octane ~ NIR[ ,1:10] + NIR[ ,20:30],
                        ncomp = 10, 
                        data = gasoline[gasoline$train ,],  
                        validation = "LOO", 
                        scale = FALSE, 
                        center = TRUE,
                        method = "simpls"
)

#I can use RMSEP() on the fitted model
RMSEP(gas.fit)

#I can use the fitted model to predict octane of my test set
predict(gas.fit, newdata = gasoline[!gasoline$train ,])  

#But I cannot get the RMSEP of the test predictions
RMSEP(gas.fit, estimate = "test", newdata = gasoline[!gasoline$train ,])

This last command throws up the error:

Error in eval(predvars, data, env) : object 'NIR' not found

What I know: I know the object 'NIR' should be present, since I've opted to combine train and test data into a single dataframe.

RMSEP() works fine on models of the style "plsr( Y ~ X[whatever , whatever ]" as long as no newdata argument is passed. predict() works fine both with and without newdata.
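One possible workaround, offered as a sketch rather than a confirmed fix: do the column indexing before the model is fitted and store the selected wavelengths as a single named matrix column, so that the formula refers to a plain variable name. The likely cause (an assumption, not verified against the pls source) is that RMSEP() evaluates the formula a second time against data in which the indexed terms are already stored under names like `NIR[, 1:10]`, so a bare `NIR` can no longer be found. The names `cols`, `gas.sub`, `NIRsub`, `train_idx` and `gas.fit2` are hypothetical, and the split uses base R `sample()` instead of `sample.split()` to keep the example free of extra packages.

```r
library(pls)

data(gasoline)

# Assumption: a simple 70/30 random split with base R, standing in for sample.split()
set.seed(123)
train_idx <- sample(nrow(gasoline), size = round(0.7 * nrow(gasoline)))

# Select the wavelengths once, outside the formula (hypothetical names)
cols <- c(1:10, 20:30)
gas.sub <- data.frame(octane = gasoline$octane)
gas.sub$NIRsub <- gasoline$NIR[, cols]   # stored as one matrix column

gas.train <- gas.sub[train_idx, ]
gas.test  <- gas.sub[-train_idx, ]

# The formula now names a column that exists in both training and test data
gas.fit2 <- plsr(octane ~ NIRsub,
                 ncomp = 10,
                 data = gas.train,
                 validation = "LOO",
                 scale = FALSE,
                 center = TRUE,
                 method = "simpls")

# Test-set RMSEP, computed on the held-out rows
rmsep.test <- RMSEP(gas.fit2, estimate = "test", newdata = gas.test)
print(rmsep.test)
```

The trade-off is that the column selection lives in the data preparation step instead of the formula, which also makes it easy to change the selection inside a loop.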

What I've tried: Mevik & Wehrens (2007) insist we use the format

plsr( octane ~ NIR,
...
data = gasoline
...)

and not

plsr( gasoline$octane ~ gasoline$NIR,

which is more akin to what I am doing in my example, but not exactly the same. Even so, I've tried the following adjustment:

gas.fit <- plsr(octane ~ NIR,
                        ncomp = 10, 
                        data = c(
              gasoline[gasoline$train ,]$NIR[ , 1:10],gasoline[gasoline$train ,]$NIR[ ,20:30]
                        ),  
                        validation = "LOO", 
                        scale = FALSE, 
                        center = TRUE,
                        method = "simpls"
)

But this does not work either (the error is "'envir' not of length one"). It would also mean adding gasoline$octane to the same c() call, which moves the data argument even further from what plsr() expects.
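A small base R illustration of why the `c()` attempt probably fails: `c()` drops the dimensions of its matrix arguments and returns one long vector, so what reaches `data` is neither a data frame nor a list of named variables. The toy matrices `a` and `b` are made up for the example; `cbind()` is shown as the call that keeps the row structure.

```r
# Two toy matrices standing in for the two NIR column blocks (hypothetical data)
a <- matrix(1:6,  nrow = 3)
b <- matrix(7:12, nrow = 3)

# c() flattens both matrices into a single vector with no dimensions
flat <- c(a, b)
print(is.null(dim(flat)))
print(length(flat))

# cbind() keeps one row per sample and places the column blocks side by side
wide <- cbind(a, b)
print(dim(wide))

# A data frame holding the combined block as one matrix column
# can be passed as a `data` argument
df <- data.frame(y = c(0.1, 0.2, 0.3))
df$X <- wide
print(is.data.frame(df))
```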

I'd really like to find a solution along these lines, as my end goal is to call plsr() inside a for() loop of the style:

gas.fit <- plsr(octane ~ NIR[ ,i:(i+20)],

as part of a Moving Window PLSR algorithm.
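For the moving-window use case, the same idea of indexing outside the formula can be sketched as a loop over window start positions. This is a minimal illustration, not a full Moving Window PLSR implementation: the window width, the step between windows, the base R split, and the names `width`, `starts`, `win`, `NIRwin` and `window_rmsep` are all assumptions made for the example, and cross-validation is left out to keep it short.

```r
library(pls)

data(gasoline)

# Assumption: a simple 70/30 random split with base R
set.seed(123)
train_idx <- sample(nrow(gasoline), size = round(0.7 * nrow(gasoline)))

width  <- 20                                              # window spans i:(i + width)
starts <- seq(1, ncol(gasoline$NIR) - width, by = width)  # hypothetical step size

# For each window, fit on the training rows and keep the lowest test RMSEP
window_rmsep <- sapply(starts, function(i) {
  win <- data.frame(octane = gasoline$octane)
  win$NIRwin <- gasoline$NIR[, i:(i + width)]   # index here, not in the formula

  fit <- plsr(octane ~ NIRwin,
              ncomp = 10,
              data = win[train_idx, ],
              method = "simpls")

  res <- RMSEP(fit, estimate = "test", newdata = win[-train_idx, ])
  min(res$val[1, 1, ])   # smallest test RMSEP across the fitted model sizes
})

# Window start with the lowest test RMSEP
best_start <- starts[which.min(window_rmsep)]
print(data.frame(start = starts, rmsep = window_rmsep))
```

Because each iteration builds its own small data frame, the formula stays fixed as `octane ~ NIRwin` and only the data changes between windows.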

question from:https://stackoverflow.com/questions/65836553/error-in-evalpredvars-data-env-object-not-found-in-r-pls
