Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
642 views
in Technique[技术] by (71.8m points)

ggplot2 - Getting counts on bins in a heat map using R

This question follows from these two topics:

How to use stat_bin2d() to compute counts labels in ggplot2?

How to show the numeric cell values in heat map cells in r

In the first topic, a user wants to use stat_bin2d to generate a heatmap, and then wants the count of each bin written on top of the heat map. The method the user initially wants to use doesn't work, the best answer stating that stat_bin2d is designed to work with geom = "rect" rather than "text". No satisfactory response is given.

The second question is almost identical to the first, with one crucial difference, that the variables in the second question question are text, not numeric. The answer produces the desired result, placing the count value for a bin over the bin in a stat_2d heat map.

To compare the two methods i've prepared the following code:

    library(ggplot2)
    data <- data.frame(x = rnorm(1000), y = rnorm(1000))
    ggplot(data, aes(x = x, y = y))
      geom_bin2d() + 
      stat_bin2d(geom="text", aes(label=..count..))

We know this first gives you the error:

"Error: geom_text requires the following missing aesthetics: x, y".

Same issue as in the first question. Interestingly, changing from stat_bin2d to stat_binhex works fine:

    library(ggplot2)
    data <- data.frame(x = rnorm(1000), y = rnorm(1000))
    ggplot(data, aes(x = x, y = y))
      geom_binhex() + 
      stat_binhex(geom="text", aes(label=..count..))

Which is great and all, but generally, I don't think hex binning is very clear, and for my purposes wont work for the data i'm trying to desribe. I really want to use stat_2d.

To get this to work, i've prepared the following work around based on the second answer:

    library(ggplot2)
    data <- data.frame(x = rnorm(1000), y = rnorm(1000))
    x_t<-as.character(round(data$x,.1))
    y_t<-as.character(round(data$y,.1))
    x_x<-as.character(seq(-3,3),1)
    y_y<-as.character(seq(-3,3),1)
    data<-cbind(data,x_t,y_t)



    ggplot(data, aes(x = x_t, y = y_t)) +
      geom_bin2d() + 
      stat_bin2d(geom="text", aes(label=..count..))+
      scale_x_discrete(limits =x_x) +
      scale_y_discrete(limits=y_y) 

This works around allows one to bin numerical data, but to do so, you have to determine bin width (I did it via rounding) before bringing it into ggplot. I actually figured it out while writing this question, so I may as well finish. This is the result: (turns out I can't post images)

So my real question here, is does any one have a better way to do this? I'm happy I at least got it to work, but so far I haven't seen an answer for putting labels on stat_2d bins when using a numerical variable.

Does any one have a method for passing on x and y arguments to geom_text from stat_2dbin without having to use a work around? Can any one explain why it works with text variables but not with numbers?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Another work around (but perhaps less work). Similar to the ..count.. method you can extract the counts from the plot object in two steps.

library(ggplot2)

set.seed(1)
dat <- data.frame(x = rnorm(1000), y = rnorm(1000))

# plot
p <- ggplot(dat, aes(x = x, y = y)) + geom_bin2d() 

# Get data - this includes counts and x,y coordinates 
newdat <- ggplot_build(p)$data[[1]]

# add in text labels
p + geom_text(data=newdat, aes((xmin + xmax)/2, (ymin + ymax)/2, 
                  label=count), col="white")

enter image description here


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...