Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
179 views
in Technique by (71.8m points)

How to draw a plot comparing normal and binomial distributions in R?

I need to draw this kind of plot, but I can't work out how to do it. I have two plots of these distributions. Normal:

library(tidyverse)
tibble(x = sort(rnorm(1e5)),
       cumulative = cumsum(abs(x)/sum(abs(x)))/2.5) %>%
  ggplot(aes(x)) + 
  geom_histogram(aes(y = ..density..), bins = 500)+
  geom_density(color = "red")+
  geom_line(aes(y = cumulative), color = "navy")+
  scale_y_continuous(sec.axis = sec_axis(~.*2.5, name = "cumulative density"))

and binomial:

library(tidyverse)
set.seed(10)
tibble(x = sort(rbinom(1e5,1e5, 0.001))) %>%
  ggplot(aes(x)) + 
  geom_histogram(aes(y = ..density..), bins = 90)+
  geom_density(color = "red")

I can't work out how to compare these two distributions on one plot over the range [0, 1]. Maybe I need to change my plots, but either way I don't see how to put the two plots together over a given range. Does anyone know how to do it?



1 Reply

0 votes
by (71.8m points)

I am not sure what you want to get out of such a comparison. Before putting the two graphs together, I think your code may have some issues. First, cumsum(abs(x)/sum(abs(x))) may not be correct; I replaced it with cumsum(abs(10-x)/sum(abs(10-x))). Second, for the binomial distribution, rbinom(1e5,1e5, 0.001) gives you counts of successes rather than proportions; I replaced it with rbinom(1e5,1e5, 0.001)/1e5.

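library(tidyverse)
set.seed(10)  # make the random draws reproducible

# Standard normal sample plus the rescaled cumulative curve for the secondary axis
df1 <- tibble(x = sort(rnorm(1e5)),
              cumulative = cumsum(abs(10 - x) / sum(abs(10 - x))) / 2.5)
# Binomial counts divided by the number of trials, giving proportions
df2 <- tibble(x1 = sort(rbinom(1e5, 1e5, 0.001) / 1e5))

# after_stat(density) needs ggplot2 >= 3.3.0; older versions spell it ..density..
ggplot(df1, aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)), bins = 500) +
  geom_density(color = "red") +
  geom_line(aes(y = cumulative), color = "navy") +
  scale_y_continuous(sec.axis = sec_axis(~ . * 2.5, name = "cumulative density")) +
  geom_histogram(data = df2, aes(x = x1, y = after_stat(density)), bins = 90)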
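
The navy curve above is a rescaled running sum, not an empirical CDF. If what you actually want is the cumulative proportion of observations, one possible alternative is sketched below. This is an assumption about the intent, not what the question's code computes. The name df_ecdf and the scaling factor 0.4 (roughly the peak density of a standard normal) are choices made only for this illustration, and the sample is smaller so that it runs quickly.

```r
library(ggplot2)

set.seed(10)
x <- sort(rnorm(1e4))

# For sorted x, the empirical CDF at the i-th value is i / n
df_ecdf <- data.frame(x = x, cumulative = seq_along(x) / length(x))

# The ECDF runs from 0 to 1 while the density peaks near 0.4, so the line is
# multiplied by 0.4 for drawing and the secondary axis divides by 0.4 again
p_ecdf <- ggplot(df_ecdf, aes(x)) +
  geom_histogram(aes(y = after_stat(density)), bins = 100) +
  geom_density(color = "red") +
  geom_line(aes(y = cumulative * 0.4), color = "navy") +
  scale_y_continuous(sec.axis = sec_axis(~ . / 0.4, name = "cumulative proportion"))

p_ecdf
```

stat_ecdf() in ggplot2 draws the same step curve directly, but it cannot be rescaled onto a shared axis as easily, which is why the values are computed by hand here.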

You can change bins to adjust the bar heights. However, we need to be careful when interpreting the difference between the two distributions: one is the distribution of individual values with mean 0 and SD 1 (the normal distribution), while the other is the distribution of estimated proportions, each based on a success probability of 0.001 and a sample size of 1e5.

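# Same plot with fewer bins and distinct fill colors for the two samples
# after_stat(density) needs ggplot2 >= 3.3.0; older versions spell it ..density..
ggplot(df1, aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)), fill = "red", bins = 15) +
  geom_density(color = "red") +
  geom_line(aes(y = cumulative), color = "navy") +
  scale_y_continuous(sec.axis = sec_axis(~ . * 2.5, name = "cumulative density")) +
  geom_histogram(data = df2, aes(x = x1, y = after_stat(density)), color = "green", fill = "green", bins = 15)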
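
Because the proportions have mean 0.001 and a standard deviation of about 1e-4, they collapse into a narrow spike next to a standard normal. If the aim is to compare shapes rather than raw scales, one option is to standardize the binomial counts and overlay the standard normal density. This is only a sketch of that idea, not something the question asked for explicitly. The names n, prob, z, step and p_z are made up for the example.

```r
library(ggplot2)

set.seed(10)
n    <- 1e5     # number of trials per draw
prob <- 0.001   # success probability

k <- rbinom(1e5, n, prob)

# Standardize: subtract the binomial mean n*p, divide by the SD sqrt(n*p*(1-p))
z <- (k - n * prob) / sqrt(n * prob * (1 - prob))

# One integer count step expressed in z units, used as the histogram bin width
# so that each bar spans roughly one possible count
step <- 1 / sqrt(n * prob * (1 - prob))

p_z <- ggplot(data.frame(z = z), aes(z)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = step,
                 fill = "green", alpha = 0.5) +
  stat_function(fun = dnorm, color = "red")  # standard normal density

p_z
```

With n * prob = 100 expected successes the histogram should sit close to the red curve, which is the usual normal approximation to the binomial.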
