I am creating density plots with kde2d (MASS) on lat and lon data. I would like to know which points from the original data are within a specific contour.
I create 90% and 50% contours using two approaches. I want to know which points are within the 90% contour and which points are within the 50% contour. The points in the 90% contour will contain all of those within the 50% contour. The final step is to find the points within the 90% contour that are not within the 50% contour (I do not necessarily need help with this step).
# bw = data of 2 cols (lat and lon) and 363 rows
# two versions to do this:
# would ideally like to use the second version (with ggplot2)
# version 1 (without ggplot2)
library(MASS)
x <- bw$lon
y <- bw$lat
dens <- kde2d(x, y, n=200)
# the contours to plot
prob <- c(0.9, 0.5)
dx <- diff(dens$x[1:2])
dy <- diff(dens$y[1:2])
sz <- sort(dens$z)
c1 <- cumsum(sz) * dx * dy
levels <- sapply(prob, function(x) {
  approx(c1, sz, xout = 1 - x)$y
})
plot(x,y)
contour(dens, levels=levels, labels=prob, add=TRUE)
And here is version 2 - using ggplot2. I would ideally like to use this version to find the points within the 90% and 50% contours.
# version 2 (with ggplot2)
getLevel <- function(x, y, prob) {
  kk <- MASS::kde2d(x, y)
  dx <- diff(kk$x[1:2])
  dy <- diff(kk$y[1:2])
  sz <- sort(kk$z)
  c1 <- cumsum(sz) * dx * dy
  approx(c1, sz, xout = 1 - prob)$y
}
# 90 and 50% contours
L90 <- getLevel(bw$lon, bw$lat, 0.9)
L50 <- getLevel(bw$lon, bw$lat, 0.5)
library(ggplot2)
library(reshape2)  # provides melt()
kk <- MASS::kde2d(bw$lon, bw$lat)
dimnames(kk$z) <- list(kk$x, kk$y)
dc <- melt(kk$z)
# each line ends with "+" so that R continues the expression onto the next line
p <- ggplot(dc, aes(x=Var1, y=Var2)) + geom_tile(aes(fill=value)) +
  geom_contour(aes(z=value), breaks=L90, colour="red") +
  geom_contour(aes(z=value), breaks=L50, colour="yellow") +
  ggtitle("90 (red) and 50 (yellow) contours of BW")
I create the plots with all of the lat and lon points plotted and 90% and 50% contours. I simply want to know how to extract the exact points that are within the 90% and 50% contours.
I have tried to find the z values (the height of the density surface from kde2d) associated with each row of lat and lon values, but had no luck. I was also thinking I could add an ID column to the data to label each row and then somehow carry it over after using melt(). Then I could simply subset the data whose z values match each contour I want and use the ID column to see which lat and lon rows of the original BW data they correspond to.
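To make the idea above concrete, here is a minimal sketch of what I have in mind, not a confirmed solution. It assumes the density at each point can be approximated by the value at the nearest kde2d grid node, so points very close to a contour line may be misclassified (a finer grid, here n = 200, should reduce this). The simulated `bw` and the helper names `levelFromGrid` and `nearest` are made up for illustration. The levels are computed from the same grid that is used for the lookup.

```r
library(MASS)

# Hypothetical stand-in for bw (the real data are not shown): 363 rows of lon/lat
set.seed(1)
bw <- data.frame(lon = rnorm(363, -70, 0.5), lat = rnorm(363, 42, 0.3))
bw$ID <- seq_len(nrow(bw))  # row label, to match results back to the original data

# One density grid, reused for both the contour levels and the per-point lookup
kk <- kde2d(bw$lon, bw$lat, n = 200)

# Same calculation as getLevel(), but applied to an existing grid (hypothetical helper)
levelFromGrid <- function(kk, prob) {
  dx <- diff(kk$x[1:2])
  dy <- diff(kk$y[1:2])
  sz <- sort(kk$z)
  c1 <- cumsum(sz) * dx * dy
  approx(c1, sz, xout = 1 - prob)$y
}
L90 <- levelFromGrid(kk, 0.9)
L50 <- levelFromGrid(kk, 0.5)

# Index of the grid node closest to each observation (hypothetical helper)
nearest <- function(v, grid) {
  vapply(v, function(p) which.min(abs(grid - p)), integer(1))
}
ix <- nearest(bw$lon, kk$x)
iy <- nearest(bw$lat, kk$y)

# Approximate density height at each point; kk$z is indexed as [x, y]
bw$z <- kk$z[cbind(ix, iy)]

# A point counts as inside a contour if its density is at or above that level
bw$in90 <- bw$z >= L90
bw$in50 <- bw$z >= L50

# Points inside the 90% contour but outside the 50% contour
ring <- bw[bw$in90 & !bw$in50, ]
```

The ID column in `ring` (or in `bw[bw$in50, ]`) would then identify the matching rows of the original data. Interpolating the grid at each point instead of taking the nearest node would presumably be more accurate, but I have not tried that.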
Here is a picture of what I am talking about:
I want to know which red points are within the 50% contour (blue) and which are within the 90% contour (red).
Note: much of this code is from other questions. Big shout-out to all those who contributed!
Thank you!