No, No, No, No, No, No! Do not use the formula interface in the way you are doing if you want all the other sugar that comes with model formulas. You wrote:
c_lm = lm(trainingset$dependent ~ trainingset$independent)
You repeat trainingset twice, which is a waste of fingers/time, redundant, and, not least, the cause of the problem you are hitting. When you now call predict(), it will look for a variable in testset that has the name trainingset$independent, which of course doesn't exist. Instead, use the data argument in your call to lm(). For example, this fits the same model as your formula but is more efficient and also works properly with predict():
c_lm = lm(dependent ~ independent, data = trainingset)
Now when you call predict(c_lm, newdata = testset), you only need to have a data frame with a variable whose name is independent (or whatever you have in the model formula).
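To see the difference for yourself, here is a minimal sketch on made-up data. The values, the seed, and the names bad_lm, good_lm, bad_pred and good_pred are my own assumptions for illustration; only trainingset, testset, dependent and independent come from your question. With the $ formula, predict() cannot find trainingset$independent in testset, so it falls back to the original training data, warns, and returns one value per training row instead of one per row of testset.

```r
# Hypothetical toy data: 20 training rows, 5 test rows
set.seed(1)
trainingset <- data.frame(independent = 1:20)
trainingset$dependent <- 2 * trainingset$independent + rnorm(20)
testset <- data.frame(independent = 21:25)

# Problematic form: the model term is literally `trainingset$independent`
bad_lm <- lm(trainingset$dependent ~ trainingset$independent)
# predict() warns here that 'newdata' had 5 rows but the variables found
# have 20 rows; the warning is suppressed only to keep the output quiet
bad_pred <- suppressWarnings(predict(bad_lm, newdata = testset))
length(bad_pred)   # 20: predictions for the training rows, testset is ignored

# Recommended form: bare variable names plus the data argument
good_lm <- lm(dependent ~ independent, data = trainingset)
good_pred <- predict(good_lm, newdata = testset)
length(good_pred)  # 5: one prediction per row of testset
```

Both fits have the same coefficients; the only thing that changes is how the variables are named inside the model, and that is what predict() uses to look them up in newdata.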
An additional reason to write formulas the way I show them is legibility. Getting the object name out of the formula lets you see more easily what the model is.