I have an issue with plotting continuous distributions without sharp edges in ggplot2. I need to show two of them on one plot. Also, it does not have to be ggplot2 to achieve this result.
I am aware, that the number of data points directly influences the smoothness of the line, but it was not the case here. Below you can see some sample data (from dput)
sample.data<-list(beta.data = structure(list(cluster = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), beta.density = c(0, 3.42273368363934e-43, 8.42987148403021e-29,
2.04764468657484e-20, 1.69485562831516e-14, 6.07999638837842e-10, 2.88180370232676e-06, 0.00314064636750876, 0.954118897015866, 0, 0, 3.80101893822358e-36, 6.43342582657081e-22, 6.82956252277493e-14, 1.75291058931833e-08, 0.000131874335695378, 0.0750918340641428, 3.72532418759802, 5.05242078519544, 0), pr = c(0, 0.111111111111111, 0.222222222222222, 0.333333333333333, 0.444444444444444, 0.555555555555556, 0.666666666666667, 0.777777777777778, 0.888888888888889, 1, 0, 0.111111111111111, 0.222222222222222, 0.333333333333333, 0.444444444444444, 0.555555555555556, 0.666666666666667, 0.777777777777778, 0.888888888888889, 1)), row.names = c(NA, -20L), class = "data.frame"), beta.params = structure(list(cluster = 1:2, a = c(49, 50), b = c(2, 10), ni.beta = c(0.961,0.833), sd.beta = c(0.00072, 0.00228)), row.names = c(NA,-2L), class = "data.frame"))
Before I was using geom_col, but it discretizes values. I went with geom_area:
ggplot(sample.data$beta.data, aes(x = pr, y = beta.density)) +
geom_area(stat = "function",
fun = dbeta,
args = list(shape1 = sample.data$beta.params[1,"a"], shape2 = sample.data$beta.params[1,"b"]),
fill = "#F1C40F",
colour = "black",
alpha = 0.7) +
geom_area(stat = "function",
fun=dbeta,
args = list(shape1 = sample.data$beta.params[2,"a"], shape2 = sample.data$beta.params[2,"b"]),
fill = "#3498DB",
colour = "black",
alpha = 0.7)
I presented you the data with 10 points, but 1000 points look almost the same. It is not the case here, where even 100 points looks ok:
p = seq(0,1, length=100)
plot(p, dbeta(p, 50, 10), ylab="Density", type ="l", col=4, , lwd = 2)
Here I am attaching code to simulate the data. Oh, and these troublesome beta parameters were a = 49 and b = 2.
len <- 100
p <- seq(0,1, length.out = len)
df <- data.frame(rbind(cbind("cl" = rep(1, times = length(p)), "beta" = dbeta(p, 50, 10),"p"= p),
cbind("cl" = rep(1, times = length(p)), "beta" = dbeta(p, 40, 2),"p"= p)))
Do you have any ideas?
EDIT: The pdfs stands here for probability density functions. That is why I have not put "pdf" as a tag. My apologies for the confusion!
Anyway, when I tried to print graphic to PDF file, the result was poor as well (sharp edges). But it the end, it shouldn't matter. I want to see smooth lines whatever I do (reasonably).
EDIT2 It is possible to achieve because:
library(mosaic)
theme_set(theme_bw())
xpbeta(c(0.7, 0.90), shape1 = 49, shape2 = 2)
It produces nice, smoothed beta dist with parameters (49, 2). But then again, I need to show two dists in one chart.
question from:
https://stackoverflow.com/questions/65947332/how-to-plot-many-probability-density-functions-pdfs-without-sharp-edges