Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - ggplot2: Add p-value to grouped box plots

I am trying to add p-values to my graph using the stat_signif function.
The problem is that my box plots are grouped: within each category on the x-axis I want to compare the two boxes of that category, but stat_signif expects x-axis values to define a comparison.
This is my code:

library(ggplot2)
library(ggsignif)  # provides stat_signif()

# Define the elements for plotting; boxes are grouped by "strandness" (Group).
p <- ggplot(plot.data, aes(x = Element, y = Value, fill = Group)) +
  geom_boxplot(outlier.shape = NA, colour = "black") +
  scale_fill_manual(values = c("goldenrod", "darkgreen")) +
  coord_cartesian(ylim = c(0, 0.03)) +
  # 'fun' replaces the deprecated 'fun.y' (ggplot2 >= 3.3.0)
  stat_summary(fun = mean, colour = "black", geom = "point", shape = 18, size = 4,
               show.legend = FALSE, position = position_dodge(0.75)) +
  theme(legend.title = element_blank(),
        legend.text = element_text(size = 16),
        axis.text.x = element_text(color = "black", size = 12),
        axis.text.y = element_text(color = "black", size = 12),
        panel.background = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_line(colour = "black"),
        panel.border = element_rect(colour = "black", fill = NA, size = 0.5),
        legend.key = element_rect(colour = "transparent", fill = "white")) +
  theme(plot.title = element_text(lineheight = .8, hjust = 0.5, size = 20),
        axis.title.y = element_text(size = 20, angle = 90,
                                    margin = margin(t = 0, r = 20, b = 0, l = 0))) +
  labs(x = "", y = paste0(dinuc, " frequency")) +
  theme(plot.margin = unit(c(2, 1, 1, 1), "cm")) +
  # stat_compare_means(aes(group = group))
  stat_signif(comparisons = list(c("Genes", "mRNA")),
              test = "wilcox.test",
              test.args = list(paired = FALSE, exact = FALSE, correct = FALSE),
              map_signif_level = TRUE,  # an argument of stat_signif(), not of the test
              y_position = 0.02)

Where the plot.data data frame looks like:

  Group, Value, Element
1 Transcribed, 0.004814926, Genes
2 Non-transcribed, 0.008926, Genes
3 Transcribed, 0.086000026, mRNA
4 Non-transcribed, 0.00548, mRNA
5 Transcribed, 0.258400078, Exons
6 Non-transcribed, 0.23008457, Exons
7 Transcribed, 0.00005687, Introns
8 Non-transcribed, 0.890000521, Introns

etc. (For every element there are about 10000 rows)

The code above produces a plot, but what I actually want is to compare the transcribed and non-transcribed box plots of every element.


1 Reply


You haven't posted enough data to get p-values, so I'm posting an example that you can adjust to your dataset:

library(tidyverse)
library(ggpubr)

mtcars %>%
  mutate_at(vars(am, cyl), as.factor) %>%
  ggplot(aes(cyl, disp, fill=am))+
  geom_boxplot()+
  stat_compare_means(aes(group = am))


You can use stat_compare_means(aes(group = am), label = "p.format") if you want only the p-values in your plot.
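To show how the same idea might carry over to a data frame shaped like plot.data, here is a minimal sketch. The data are simulated (sim.data and its values are hypothetical stand-ins, not the asker's measurements), and it assumes a Wilcoxon test is wanted, as in the question. The last lines recompute the per-element p-values with base R so the labels on the plot can be checked.

```r
library(ggplot2)
library(ggpubr)

# Simulated stand-in for plot.data: same columns (Group, Value, Element),
# hypothetical values drawn from exponential distributions.
set.seed(1)
elements <- c("Genes", "mRNA", "Exons", "Introns")
groups <- c("Transcribed", "Non-transcribed")
sim.data <- expand.grid(rep = 1:200, Group = groups, Element = elements)
sim.data$Value <- rexp(nrow(sim.data),
                       rate = ifelse(sim.data$Group == "Transcribed", 100, 80))
sim.data$rep <- NULL

# Grouped box plot; aes(group = Group) asks stat_compare_means() to compare
# the two Group boxes within each Element on the x-axis.
p <- ggplot(sim.data, aes(x = Element, y = Value, fill = Group)) +
  geom_boxplot(outlier.shape = NA, colour = "black") +
  scale_fill_manual(values = c("goldenrod", "darkgreen")) +
  stat_compare_means(aes(group = Group), method = "wilcox.test",
                     label = "p.format")

# Cross-check: one Wilcoxon test per Element, computed with base R.
p.values <- sapply(split(sim.data, sim.data$Element),
                   function(d) wilcox.test(Value ~ Group, data = d)$p.value)
print(p.values)
```

If the y-axis is clipped with coord_cartesian(ylim = ...), as in the question, the labels may fall outside the visible range; the label.y argument of stat_compare_means() can be used to place them inside it.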

