Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - plot.lm Error: $ operator is invalid for atomic vectors

I have the following regression model with transformations:

fit <- lm( I(NewValue ^ (1 / 3)) ~ I(CurrentValue ^ (1 / 3)) + Age + Type - 1,
           data = dataReg)
plot(fit)

But plot gives me the following error:

Error: $ operator is invalid for atomic vectors

Any ideas about what I am doing wrong?

Note: summary, predict, and residuals all work correctly.



1 Reply


This is actually quite an interesting observation. In fact, among the six plots supported by plot.lm, only the Q-Q plot fails in this case. Consider the following reproducible example:

x <- runif(20)
y <- runif(20)
fit <- lm(I(y ^ (1/3)) ~ I(x ^ (1/3)))
## only `which = 2L` (QQ plot) fails; `which = 1, 3, 4, 5, 6` all work
stats:::plot.lm(fit, which = 2L)

Inside plot.lm, the Q-Q plot is essentially produced as follows:

rs <- rstandard(fit)  ## standardised residuals
qqnorm(rs)  ## fine
## inside `qqline(rs)`
yy <- quantile(rs, c(0.25, 0.75))
xx <- qnorm(c(0.25, 0.75))
slope <- diff(yy)/diff(xx)
int <- yy[1L] - slope * xx[1L]
abline(int, slope)  ## this fails!!!

Error: $ operator is invalid for atomic vectors

So this is purely a problem with the abline function. Note:

is.object(int)
# [1] TRUE

is.object(slope)
# [1] TRUE

i.e., both int and slope have a class attribute (read ?is.object; it is a very efficient way to check whether an object has a class attribute). Which class?

class(int)
# [1] "AsIs"

class(slope)
# [1] "AsIs"

This is the result of using I(). More precisely, they inherit this class from rs, which in turn inherits it from the response variable. That is, if we use I() on the response (the LHS of the model formula), we get this behaviour.
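The propagation can be checked without fitting a model at all. The following is a minimal sketch using only base R; the vector v is made up for illustration. Arithmetic copies the attributes of its operand, and subsetting has a method for AsIs, so the class survives the kind of steps shown in the qqline code above:

```r
## hypothetical data, only for illustration
v <- I(c(0.2, 0.5, 0.9) ^ (1 / 3))

is.object(v)             ## does v carry a class attribute?
class(v)                 ## which one?

w <- v / 2               ## arithmetic keeps the attributes of v
first <- w[1L]           ## subsetting goes through `[.AsIs`
inherits(w, "AsIs")
inherits(first, "AsIs")

plain <- as.numeric(v)   ## as.numeric() drops all attributes
is.object(plain)
```

Only the last object, plain, is safe to pass as the first argument of abline.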

You can do a few experiments here:

abline(as.numeric(int), as.numeric(slope))  ## OK
abline(as.numeric(int), slope)  ## OK
abline(int, as.numeric(slope))  ## fails!!
abline(int, slope)  ## fails!!

So abline(a, b) is very sensitive to whether its first argument a has a class attribute.
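Building on these experiments, one possible workaround (a sketch, not part of plot.lm) is to strip the class from the standardised residuals and draw the Q-Q plot by hand. The helper name qq_plain is made up for this example, and the axis label and line style are arbitrary choices:

```r
## hypothetical helper: Q-Q plot of standardised residuals without the AsIs class
qq_plain <- function(model) {
  rs <- as.numeric(rstandard(model))  ## as.numeric() drops the class attribute
  qqnorm(rs, ylab = "Standardized residuals")
  qqline(rs, lty = 3, col = "gray50")
  invisible(rs)
}

set.seed(1)
x <- runif(20)
y <- runif(20)
fit <- lm(I(y ^ (1/3)) ~ I(x ^ (1/3)))

pdf(NULL)  ## draw to a null device; drop this line to see the plot
rs <- qq_plain(fit)
dev.off()
```

This leaves the model itself untouched, so summary, predict, and residuals behave exactly as before.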

Why? Because abline can accept a linear model object with "lm" class. Inside abline:

if (is.object(a) || is.list(a)) {
    p <- length(coefa <- as.vector(coef(a)))

If a has a class, abline assumes it is a model object (regardless of whether it really is!) and tries to use coef to obtain its coefficients. This check is not very robust; we can make abline fail rather easily:

plot(0:1, 0:1)
a <- 0  ## plain numeric
abline(a, 1)  ## OK
class(a) <- "whatever"  ## add a class
abline(a, 1)  ## oops, fails!!!

Error: $ operator is invalid for atomic vectors
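The failing step can be isolated from the plotting code. As a sketch (assuming the excerpt of abline shown above matches your R version), calling coef directly on the same kind of classed atomic vector falls through to the default method, which extracts the coefficients with $ and therefore fails on an atomic vector:

```r
a <- 0
class(a) <- "whatever"   ## same toy object as above

## capture the error message instead of stopping the script
res <- tryCatch(coef(a), error = function(e) conditionMessage(e))
print(res)

## without the class, the is.object() check in the excerpt no longer triggers
b <- unclass(a)
is.object(b)
```

So unclass(a) or as.numeric(a) is enough to get the plain intercept/slope behaviour back.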

So here is the conclusion: avoid using I() on the response variable in the model formula. I() is fine on covariates, but not on the response. lm and most generic functions handle it without trouble, but the Q-Q plot in plot.lm does not.
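One way to follow this advice, sketched here with simulated data (the data frame dat and the column y_cr are made up for illustration), is to compute the transformed response as an ordinary column before fitting. The estimates are the same, but the residuals no longer carry the AsIs class:

```r
set.seed(1)
dat <- data.frame(x = runif(20), y = runif(20))  ## simulated data

## transform the response up front instead of using I() on the LHS
dat$y_cr <- dat$y ^ (1 / 3)
fit_plain <- lm(y_cr ~ I(x ^ (1 / 3)), data = dat)  ## I() on a covariate is fine
fit_asis  <- lm(I(y ^ (1 / 3)) ~ I(x ^ (1 / 3)), data = dat)

## same estimates either way
all.equal(as.numeric(coef(fit_plain)), as.numeric(coef(fit_asis)))

is.object(rstandard(fit_plain))  ## plain numeric residuals, no class

pdf(NULL)  ## null device; drop this line to see the plot
plot(fit_plain, which = 2L)      ## Q-Q plot
dev.off()
```

For the model in the question, the same idea would mean adding a cube-root column of NewValue to dataReg and using that column on the LHS.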

