The behavior you're seeing is due to how ggplot2 handles data that fall outside the axis ranges of the plot. The result depends on whether you set the axis ranges with scale_y_continuous (or, equivalently, ylim) or with coord_cartesian, as explained below.
library(ggplot2)
# All points are visible in the plot
ggplot(mtcars, aes(mpg, hp)) +
geom_point()
In the code below, one point with hp = 335 is outside the y-range of the plot. Because we used scale_y_continuous to set the y-axis range, ggplot2 drops this point before it calculates any statistics or summary measures, such as the linear regression line.
ggplot(mtcars, aes(mpg, hp)) +
geom_point() +
scale_y_continuous(limits=c(0,300)) + # Change this to limits=c(0,335) and the warnings disappear
geom_smooth(method="lm")
Warning messages:
1: Removed 1 rows containing missing values (stat_smooth).
2: Removed 1 rows containing missing values (geom_point).
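To see which row the warnings refer to, you can filter the data yourself. This is only a quick check in base R, assuming the same limits of 0 and 300 used above; the variable names lower, upper, and out_of_range are just illustrative.

```r
# Rows of mtcars whose hp falls outside the y-axis limits used above
lower <- 0
upper <- 300
out_of_range <- mtcars[mtcars$hp < lower | mtcars$hp > upper, c("mpg", "hp")]

# Prints the single car with hp = 335
print(out_of_range)
```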
In the code below, the point with hp = 335 is still outside the y-range of the plot, but it is nevertheless included in any statistics or summary measures that ggplot2 calculates, such as the linear regression line. This is because we used coord_cartesian to set the y-axis range. coord_cartesian only zooms the visible region and does not exclude out-of-range points from calculations on the data.
If you compare the plot from the code below with the previous one, you can see that its regression line has a slightly steeper (more negative) slope, because the point with hp = 335 is included when fitting the line even though it is not visible in the plot.
ggplot(mtcars, aes(mpg, hp)) +
geom_point() +
coord_cartesian(ylim=c(0,300)) +
geom_smooth(method="lm")
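You can confirm the difference in slope outside of ggplot2 by fitting the two regressions directly. This is a rough sketch that assumes geom_smooth(method="lm") fits hp ~ mpg (its default formula of y ~ x for these aesthetics) and that the limits behave like dropping rows with hp above 300. The names fit_all and fit_trimmed are made up for illustration.

```r
# Fit using every row, as coord_cartesian(ylim=c(0,300)) effectively does
fit_all <- lm(hp ~ mpg, data = mtcars)

# Fit after dropping rows outside the limits, as scale_y_continuous(limits=c(0,300)) does
fit_trimmed <- lm(hp ~ mpg, data = subset(mtcars, hp >= 0 & hp <= 300))

# Compare the mpg slopes; the fit that keeps hp = 335 is the steeper one
slopes <- c(all_rows = unname(coef(fit_all)["mpg"]),
            trimmed  = unname(coef(fit_trimmed)["mpg"]))
print(slopes)
```

Both slopes are negative, and the one from fit_all has the larger magnitude, which matches the steeper line in the coord_cartesian plot.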