
r - geom_boxplot() from ggplot2 : forcing an empty level to appear

I can't find a way to make ggplot2 show an empty level in a boxplot without padding my data frame with explicit missing values. Here is a reproducible example:

library(ggplot2)

# fake data
dftest <- expand.grid(time=1:10,measure=1:50)
dftest$value <- rnorm(nrow(dftest),3+0.1*dftest$time,1)

# and let's suppose we didn't observe anything at time 2

# doesn't work even when forcing with factor(..., levels=...)
p <- ggplot(data=dftest[dftest$time!=2,],aes(x=factor(time,levels=1:10),y=value))
p + geom_boxplot()

# the only way seems to be having at least one actual missing value in the dataframe
dftest2 <- dftest
dftest2[dftest2$time==2,"value"] <- NA
p <- ggplot(data=dftest2,aes(x=factor(time),y=value))
p + geom_boxplot()

So I guess I'm missing something. This is not a problem in a balanced experiment, where the missing data can be made explicit in the data frame. But with observational data, from a cohort for example, it means padding the data with missing values for every unobserved combination. Thanks for your help.


1 Reply


You can control the breaks in the appropriate scale function, in this case scale_x_discrete. By default a discrete scale drops factor levels that do not appear in the data, so make sure you pass drop=FALSE:

p <- ggplot(data=dftest[dftest$time!=2,],aes(x=factor(time,levels=1:10),y=value))
p + geom_boxplot() + 
  scale_x_discrete("time", breaks=factor(1:10), drop=FALSE)


I prefer to do my data manipulation before sending the data to ggplot, since I think it makes the code more readable. This is how I would do it myself; the result is the same. Note, however, that the scale call gets much simpler, since you no longer have to specify the breaks:

dfplot <- dftest[dftest$time!=2, ]
dfplot$time <- factor(dfplot$time, levels=1:10)

ggplot(data=dfplot, aes(x=time ,y=value)) +
    geom_boxplot() + 
    scale_x_discrete("time", drop=FALSE)
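Both versions rely on the factor still carrying the unobserved level after the rows are filtered out. A minimal base R sketch to check this before plotting, assuming the same situation as in the question (nothing observed at time 2); the names time_observed, time_factor, level_counts, empty_levels and remaining are made up for illustration:

```r
# Hypothetical check, base R only: does the unobserved level survive in the factor?
time_observed <- c(1, 3:10)                        # time 2 was never observed
time_factor <- factor(time_observed, levels = 1:10)

# table() counts every declared level, including the empty ones
level_counts <- table(time_factor)
empty_levels <- names(level_counts)[level_counts == 0]
print(empty_levels)

# droplevels() discards unused levels; the default drop=TRUE of the
# discrete scale has a comparable effect on the axis
remaining <- levels(droplevels(time_factor))
print(length(remaining))
```

If empty_levels comes back empty, the level was lost before plotting (for example because the factor was rebuilt without levels=), and drop=FALSE will have nothing to show.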
