How can I add text labels to points drawn with geom_jitter? geom_text will not work because I don't know the coordinates of the jittered points. Can I capture the positions of the jittered points so that I can pass them to geom_text?
My practical use case is a boxplot with geom_jitter drawn over it to show the data distribution. I would like to label the outlier points, or the points that match a certain condition (for example, the lowest 10% of the values used to colour the points).
One solution would be to capture the xy positions of the jittered points and reuse them later in another layer. Is that possible?
[update]
Following Joran's answer, one solution is to calculate the jittered values with the jitter function from the base package, add them to a data frame, and plot them with geom_point. For filtering, he used ddply to add a filter column (a logical vector) and used it to subset the data passed to geom_text.
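The two steps described above (precompute the jitter, then flag the rows to label) can be sketched without plotting anything. This is only a minimal illustration on made-up data: the data frame `toy` and its columns `g`, `v`, `gj` and `low` are hypothetical names, and base `ave()` stands in for `ddply` so the sketch needs no extra package.

```r
# Hypothetical toy data; names are illustrative only
set.seed(1)
toy <- data.frame(g = rep(c("a", "b"), each = 50), v = runif(100))

# jitter() needs numbers, so convert the groups to integer codes first
toy$gj <- jitter(as.numeric(factor(toy$g)))

# Logical filter column: TRUE for the lowest 10% of v within each group
toy$low <- toy$v <= ave(toy$v, toy$g, FUN = function(z) quantile(z, 0.10))

# Rows that would be passed to geom_text(), with their stored jittered x
head(subset(toy, low))
```

Because the jittered x values are stored in the data frame, every layer that maps `x` to `gj` places its points or labels at the same positions.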
He asked for a minimal dataset. I just modified his example (adding a unique identifier in the label column):
dat <- data.frame(x=rep(letters[1:3],times=100),y=runif(300),
lab=paste('id_',1:300,sep=''))
This is the result of Joran's example with my data, with the displayed ids restricted to the lowest 1%.
And this is a modification of the code that colours the points by another variable and displays some values of that variable (the lowest 1% for each group):
library("ggplot2")
library("plyr")  # provides ddply() and .()
#Create some example data
dat <- data.frame(x=rep(letters[1:3],times=100),y=runif(300),
                  lab=paste('id_',1:300,sep=''),quality=rnorm(300))
#Create a copy of the data and a jittered version of the x variable
datJit <- dat
datJit$xj <- jitter(as.numeric(factor(dat$x)))
#Create indicator variables that pick out, within each level of x,
# the obs in the lowest 1% of y (grp) and of quality (top_q)
datJit <- ddply(datJit,.(x),.fun=function(g){
  g$grp <- g$y <= quantile(g$y,0.01)
  g$top_q <- g$quality <= quantile(g$quality,0.01)
  g})
#Create a boxplot, overlay the jittered points and
# label the bottom 1% points
ggplot(dat,aes(x=x,y=y)) +
  geom_boxplot() +
  geom_point(data=datJit,aes(x=xj,colour=quality)) +
  geom_text(data=subset(datJit,grp),aes(x=xj,label=lab)) +
  geom_text(data=subset(datJit,top_q),aes(x=xj,label=sprintf("%0.2f",quality)))
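A possible alternative to precomputing the jitter, which is not what Joran's answer does, is to share one seeded position object between the point and text layers. The sketch below assumes a ggplot2 version in which position_jitter() accepts a seed argument. The names `toy`, `shown` and `pos` are hypothetical. Both layers must see the same rows in the same order to receive the same random offsets, so unwanted labels are blanked instead of subsetting the data.

```r
library(ggplot2)

# Hypothetical example data, shaped like the data above
set.seed(42)
toy <- data.frame(x = rep(letters[1:3], times = 100), y = runif(300),
                  lab = paste0("id_", 1:300))

# Keep every row; blank the label unless y is in the lowest 1% overall
toy$shown <- ifelse(toy$y <= quantile(toy$y, 0.01), toy$lab, "")

# One position object with a fixed seed, reused by both layers
# (assumes position_jitter() supports `seed`)
pos <- position_jitter(width = 0.2, height = 0, seed = 1)

p <- ggplot(toy, aes(x = x, y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_point(position = pos) +
  geom_text(aes(label = shown), position = pos, vjust = -0.5)
p
```

Setting outlier.shape = NA hides the boxplot's own outlier marks so they are not drawn a second time next to the jittered points.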