Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Creating multiple graphs based upon the column names

This is my first question on Stack Overflow; please correct me if I am not following the proper question conventions.

I am trying to create graphs for data collected at three time points (time 1, time 2, time 3), which correspond to the prefixes X1..., X2... and X3... at the beginning of the column names. Within each graph, the points are also distinguished by the Group column of the data frame (df$Group).

I have no problem creating the graphs, but I have many variables (~170) and want to compare time 1 vs time 2, time 2 vs time 3, etc., so I am looking for a shortcut that runs this kind of code without my having to type out each plot individually.

As indicated above, the variable names begin with X1, X2 or X3 to indicate the time at which the variable was recorded, e.g. X1BCSTCAT = time 1; X2BCSTCAT = time 2; X3BCSTCAT = time 3. Here is a small sample of what my data looks like:

df <- structure(list(ID = structure(1:6, .Label = c("101","102","103","118","119","120"), class = "factor"), 
                   Group = structure(c(1L,1L,1L,2L,2L,2L), .Label = c("C8","TC"), class = "factor"), 
                   Wave = structure(c(1L, 2L, 3L, 4L, 1L, 2L), .Label = c("A","B","C","D"), class = "factor"), 
                   Yr = structure(c(1L, 2L, 1L, 2L, 1L, 2L), .Label = c("3","5"), class = c("ordered", "factor")), 
                   Age.Yr. = c(10.936,10.936, 9.311, 10.881, 10.683, 11.244), 
                   Training..hr. = c(10.667,10.333, 10.667, 10.333, 10.333, 10.333), 
                   X1BCSTCAT = c(-0.156,0.637,-1.133,0.637,2.189,1.229), 
                   X1BCSTCR = c(0.484,0.192, -1.309, 0.912, 1.902, 0.484), 
                   X1BCSTPR = c(-1.773,0.859, 0.859, 0.12, -1.111, 0.12), 
                   X2BCSTCAT = c(1.006, -0.379,-1.902, 0.444, 2.074, 1.006), 
                   X2BCSTCR = c(0.405, -0.457,-1.622, 1.368, 1.981, 0.168), 
                   X2BCSTPR = c(-0.511, -0.036,2.189, -0.036, -0.894, 0.949),
                   X3BCSTCAT = c(1.18, -1.399,-1.399, 1.18, 1.18, 1.18), 
                   X3BCSTCR = c(0.967, -1.622, -1.622,0.967, 0.967, 1.255), 
                   X3BCSTPR = c(-1.282, -1.282, 1.539,1.539, 0.792, 0.792)), 
              row.names = c(1L, 2L, 3L, 4L, 5L,8L), class = "data.frame")

Here is some working code to create one graph using ggplot for time 1 vs time 2 data on one variable:

library(ggplot2)

# Refer to columns by bare name inside aes(); df$ is not needed there
p <- ggplot(df, aes(x = X1BCSTCAT, y = X2BCSTCAT, shape = Group, color = Group)) + 
  geom_point() + 
  geom_smooth(method = "lm", aes(fill = Group), fullrange = TRUE) + 
  labs(title = "BCSTCAT", x = "Time 1", y = "Time 2") + 
  scale_color_manual(name = "Group", labels = c("C8", "TC"), values = c("blue", "red")) +
  scale_shape_manual(name = "Group", labels = c("C8", "TC"), values = c(16, 17)) +
  scale_fill_manual(name = "Group", labels = c("C8", "TC"), values = c("light blue", "pink"))
p

So I am really trying to create some kind of a shortcut where R will cycle through and match up variable names X1... vs X2... and so on and create the graphs. I assume there must be some way to plot either based upon matching column numbers e.g. df[,7] vs df[,10] and iterating through this process or plotting by actually matching the names (where the only difference in variable names is the number which indicates time).

I have previously cycled through creating individual graphs using the lapply function, but have no idea where to even start with trying to do this one.


1 Reply


A solution using the tidyeval approach. It requires ggplot2 v3.0.0 or later (remember to restart your R session after installing):

install.packages("ggplot2", dependencies = TRUE)

  • First we build a function that takes column and group names as inputs. Note the use of rlang::sym, rlang::quo_name & !!.

  • Then create 2 name vectors for x- & y- values so that we can loop through them simultaneously using purrr::map2.
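
As a small aside on the first bullet, the string-to-symbol-to-unquote step can be seen in isolation, outside of ggplot2. This is only a minimal sketch: the `toy` data frame and the variable names `col_name`, `col_sym`, `call_built`, `call_text`, `result` and `label` are made up for illustration, and `rlang::as_string` is used here in place of `rlang::quo_name` to turn the symbol back into a string.

```r
library(rlang)

# Toy data frame, made up for illustration (column named like the question's data)
toy <- data.frame(X1BCSTCAT = c(1, 2, 3))

col_name <- "X1BCSTCAT"    # a column name held as a string
col_sym  <- sym(col_name)  # string -> symbol

# !! (unquote) splices the symbol into the expression before it is evaluated
call_built <- expr(mean(!!col_sym))
call_text  <- deparse(call_built)
call_text
#> [1] "mean(X1BCSTCAT)"

# Evaluate the constructed call with the data frame's columns in scope
result <- eval(call_built, toy)
result
#> [1] 2

# Symbol back to a string, e.g. for an axis label
label <- as_string(col_sym)
```

Inside `aes()`, `!!` plays the same role: the column name arrives as a string, becomes a symbol, and is spliced in as if it had been typed by hand. The answer's full code follows.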

library(rlang)
library(tidyverse)

df <- structure(list(ID = structure(1:6, .Label = c("101","102","103","118","119","120"), class = "factor"), 
                   Group = structure(c(1L,1L,1L,2L,2L,2L), .Label = c("C8","TC"), class = "factor"), 
                   Wave = structure(c(1L, 2L, 3L, 4L, 1L, 2L), .Label = c("A","B","C","D"), class = "factor"), 
                   Yr = structure(c(1L, 2L, 1L, 2L, 1L, 2L), .Label = c("3","5"), class = c("ordered", "factor")), 
                   Age.Yr. = c(10.936,10.936, 9.311, 10.881, 10.683, 11.244), 
                   Training..hr. = c(10.667,10.333, 10.667, 10.333, 10.333, 10.333), 
                   X1BCSTCAT = c(-0.156,0.637,-1.133,0.637,2.189,1.229), 
                   X1BCSTCR = c(0.484,0.192, -1.309, 0.912, 1.902, 0.484), 
                   X1BCSTPR = c(-1.773,0.859, 0.859, 0.12, -1.111, 0.12), 
                   X2BCSTCAT = c(1.006, -0.379,-1.902, 0.444, 2.074, 1.006), 
                   X2BCSTCR = c(0.405, -0.457,-1.622, 1.368, 1.981, 0.168), 
                   X2BCSTPR = c(-0.511, -0.036,2.189, -0.036, -0.894, 0.949),
                   X3BCSTCAT = c(1.18, -1.399,-1.399, 1.18, 1.18, 1.18), 
                   X3BCSTCR = c(0.967, -1.622, -1.622,0.967, 0.967, 1.255), 
                   X3BCSTPR = c(-1.282, -1.282, 1.539,1.539, 0.792, 0.792)), 
              row.names = c(1L, 2L, 3L, 4L, 5L,8L), class = "data.frame")

# define a function that accepts column names as strings
# (note: it reads df from the calling environment)
pair_plot <- function(x_var, y_var, group_var) {

  # derive the plot title from the variable name by dropping the
  # time prefix (e.g. "X1BCSTCAT" -> "BCSTCAT") instead of hard-coding it
  plot_title <- sub("^X[0-9]", "", x_var)

  # convert strings to symbols
  x_var <- rlang::sym(x_var)
  y_var <- rlang::sym(y_var)
  group_var <- rlang::sym(group_var)

  # unquote symbols using !! 
  ggplot(df, aes(x = !! x_var, y = !! y_var, shape = !! group_var, color = !! group_var)) + 
    geom_point() + 
    geom_smooth(method = "lm", aes(fill = !! group_var), fullrange = TRUE) + 
    labs(title = plot_title, x = rlang::quo_name(x_var), y = rlang::quo_name(y_var)) +
    scale_color_manual(name = "Group", labels = c("C8", "TC"), values = c("blue", "red")) +
    scale_shape_manual(name = "Group", labels = c("C8", "TC"), values = c(16, 17)) +
    scale_fill_manual(name = "Group",  labels = c("C8", "TC"), values = c("light blue", "pink")) +
    theme_bw()
}

# Test if the new function works
pair_plot("X1BCSTCAT", "X2BCSTCAT", "Group")

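# Create 2 parallel character vectors of column names
# x: the X1... and X2... columns (drop the first 6 columns and the last 3, i.e. X3...)
list_x <- colnames(df)[-c(1:6, (ncol(df)-2):(ncol(df)))]
list_x
#> [1] "X1BCSTCAT" "X1BCSTCR"  "X1BCSTPR"  "X2BCSTCAT" "X2BCSTCR"  "X2BCSTPR"

# y: the same columns shifted by 3 (the number of variables per time point
# in this sample) using dplyr::lead, so each X1... lines up with its X2..., etc.
list_y <- lead(colnames(df)[-(1:6)], 3)[1:length(list_x)]
list_y
#> [1] "X2BCSTCAT" "X2BCSTCR"  "X2BCSTPR"  "X3BCSTCAT" "X3BCSTCR"  "X3BCSTPR"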

# Loop through 2 lists simultaneously 
# Supply inputs to pair_plot function using purrr::map2
map2(list_x, list_y, ~ pair_plot(.x, .y, "Group"))

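Sample outputs (plot images not reproduced here):

#> [[1]]
[plot: X1BCSTCAT vs. X2BCSTCAT, by Group]
#> 
#> [[2]]
[plot: X1BCSTCR vs. X2BCSTCR, by Group]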
Created on 2018-05-24 by the reprex package (v0.2.0).
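
The positional indexing used for list_x and list_y relies on the columns being ordered by time point, with exactly three variables per time point. With ~170 variables, pairing by name may be more robust. The following is a minimal sketch under stated assumptions, not part of the original answer: `make_time_pairs` is a hypothetical helper, the column names in `cols` are a shortened example, and the time index is assumed to be a single digit directly after the leading `X`.

```r
# Hypothetical helper: pair every "X<from>NAME" column with "X<to>NAME",
# keeping only the pairs whose partner column actually exists.
make_time_pairs <- function(col_names, from, to) {
  from_prefix <- paste0("^X", from)
  x_cols <- grep(from_prefix, col_names, value = TRUE)
  y_cols <- sub(from_prefix, paste0("X", to), x_cols)
  keep <- y_cols %in% col_names
  data.frame(x = x_cols[keep], y = y_cols[keep], stringsAsFactors = FALSE)
}

# Shortened example of column names in the style of the question's data
cols <- c("ID", "Group",
          "X1BCSTCAT", "X1BCSTCR",
          "X2BCSTCAT", "X2BCSTCR",
          "X3BCSTCAT", "X3BCSTCR")

pairs_12 <- make_time_pairs(cols, 1, 2)  # time 1 vs time 2
pairs_23 <- make_time_pairs(cols, 2, 3)  # time 2 vs time 3
all_pairs <- rbind(pairs_12, pairs_23)

# The two columns could then be fed to the plotting function, e.g.
# purrr::map2(all_pairs$x, all_pairs$y, ~ pair_plot(.x, .y, "Group"))
```

With the real data, `cols` would be `colnames(df)`. Because the partner name is built from the variable name itself, the column order and the number of variables per time point no longer matter.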

