Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
257 views
in Technique[技术] by (71.8m points)

r - ggplot2: More complex faceting

I have a heatmap that continues to become more and more complex. An example of the melted data:

head(df2)
  Class     Subclass         Family               variable value
1     A chemosensory family_1005117 caenorhabditis_elegans    10
2     A chemosensory family_1011230 caenorhabditis_elegans     4
3     A chemosensory family_1022539 caenorhabditis_elegans    10
4     A        other family_1025293 caenorhabditis_elegans    NA
5     A chemosensory family_1031345 caenorhabditis_elegans    10
6     A chemosensory family_1033309 caenorhabditis_elegans    10
tail(df2)
     Class Subclass        Family        variable value
6496     C  class c family_455391 trichuris_muris     1
6497     C  class c family_812893 trichuris_muris    NA
6498     F  class f family_225491 trichuris_muris     1
6499     F  class f family_236822 trichuris_muris     1
6500     F  class f family_276074 trichuris_muris     1
6501     F  class f family_768194 trichuris_muris    NA

Using ggplot2 and geom_tile, I was able to produce a beautiful heatmap of the data. I am proud of the code (this is my first experience in R), so have posted it below:

df2[df2 == 0] <- NA
df2[df2 > 11] <- 10
df2.t <- data.table(df2)
df2.t[, clade := ifelse(variable %in% c("pristionchus_pacificus", "caenorhabditis_elegans", "ancylostoma_ceylanicum", "necator_americanus", "nippostrongylus_brasiliensis", "angiostrongylus_costaricensis", "dictyocaulus_viviparus", "haemonchus_contortus"), "Clade V",
                 ifelse(variable %in% c("meloidogyne_hapla","panagrellus_redivivus", "rhabditophanes_kr3021", "strongyloides_ratti"), "Clade IV",
                 ifelse(variable %in% c("toxocara_canis", "dracunculus_medinensis", "loa_loa", "onchocerca_volvulus", "ascaris_suum", "brugia_malayi", "litomosoides_sigmodontis", "syphacia_muris", "thelazia_callipaeda"), "Clade III",
                 ifelse(variable %in% c("romanomermis_culicivorax", "trichinella_spiralis", "trichuris_muris"), "Clade I",
                 ifelse(variable %in% c("echinococcus_multilocularis", "hymenolepis_microstoma", "mesocestoides_corti", "taenia_solium", "schistocephalus_solidus"), "Cestoda",
                 ifelse(variable %in% c("clonorchis_sinensis", "fasciola_hepatica", "schistosoma_japonicum", "schistosoma_mansoni"), "Trematoda", NA))))))]
df2.t$clade <- factor(df2.t$clade, levels = c("Clade I", "Clade III", "Clade IV", "Clade V", "Cestoda", "Trematoda"))
plot2 <- ggplot(df2.t, aes(variable, Family))
tile2 <- plot2 + geom_tile(aes(fill = value)) + facet_grid(Class ~ clade, scales = "free", space = "free")
tile2 <- tile2 + scale_x_discrete(expand = c(0,0)) + scale_y_discrete(expand = c(0,0))
tile2 <- tile2 + theme(axis.text.y = element_blank(), axis.ticks.y = element_blank(), legend.position = "right", axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.55), axis.text.y = element_text(size = rel(0.35)), panel.border = element_rect(fill=NA,color="grey", size=0.5, linetype="solid"))
tile2 <- tile2 + xlab(NULL)
tile2 <- tile2 + scale_fill_gradientn(breaks = c(1,2,3,4,5,6,7,8,9,10),labels = c("1", "2", "3", "4", "5", "6", "7", "8", "9", ">10"), limits = c(1, 10), colours = palette(11), na.value = "white", name = "Members")'

As you can see, there is quite a bit of manual reordering involved, otherwise the code is pretty simple. Here's the image output:

heatmap

However, you may notice that a whole column of information, "Subclass" is not utilized. Basically, each Subclass fits within a Class. It would be perfect if I was able to facet these Subclasses within the Class facet already displayed. As far as I know, this is impossible. To be precise, only Class A has varying Subclasses. The other Classes simply have their class name mirrored (F = class f). Is there another way to organize this heatmap so that I can display all of the relevant information? The missing Subclasses contain some of the most crucial data and will be the most necessary for drawing inferences from the data.

An alternative approach would be to facet the Subclases instead of the Classes, manually reorder them so that the Classes are clustered together, and then draw some sort of box around them to demarcate each Class. I have no idea how this would be done.

Any help would be very useful. Please let me know if you need any additional information.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

This will put a new strip to the right of the orignal strip, and to the left of the legend.

library(ggplot2)
library(gtable)
library(grid)

p <- ggplot(mtcars, aes(mpg, wt, colour = factor(vs))) + geom_point()
p <- p + facet_grid(cyl ~ gear)

# Convert the plot to a grob
gt <- ggplotGrob(p)

# Get the positions of the right strips in the layout: t = top, l = left, ...
strip <-c(subset(gt$layout, grepl("strip-r", gt$layout$name), select = t:r))

#  New column to the right of current strip
gt <- gtable_add_cols(gt, gt$widths[max(strip$r)], max(strip$r))  

# Add grob, the new strip, into new column
gt <- gtable_add_grob(gt, 
  list(rectGrob(gp = gpar(col = NA, fill = "grey85", size = .5)),
  textGrob("Number of Cylinders", rot = -90, vjust = .27, 
        gp = gpar(cex = .75, fontface = "bold", col = "black"))), 
        t = min(strip$t), l = max(strip$r) + 1, b = max(strip$b), name = c("a", "b"))

# Add small gap between strips
gt <- gtable_add_cols(gt, unit(1/5, "line"), max(strip$r))

# Draw it
grid.newpage()
grid.draw(gt)

enter image description here


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...