Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - A simple plot for many curves with different colors

I have the following data frame which contains 4 columns of data in addition to the vector of labels c.

Time <-c(1:4)

d<-data.frame(Time,
x1= rpois(n = 4, lambda = 10),
x2= runif(n = 4, min = 1, max = 10),
x3= rpois(n = 4, lambda = 5),
x4= runif(n = 4, min = 1, max = 5),
c=c(1,1,2,3))

I would like to use ggplot to plot the 4 curves x1, ..., x4 on top of each other, with each curve colored according to its label. Curves x1 and x2 should share a color because they have the same label, whereas curves x3 and x4 should each get a different color.

I tried the following:

d %>% pivot_longer(-c(Time,x1,x2,x3,x4))%>%
   rename(class=value) %>% select(-name) %>%
   pivot_longer(-c(Time,class)) %>%
   mutate(Label=ifelse(Time==max(Time,na.rm = T),name,NA),
          Label=ifelse(duplicated(Label),NA,Label)) %>%

  ggplot(aes(x=Time,y=value,color=factor(class),group=name))+
  geom_line()+
  labs(color='class')+
  scale_color_manual(values=c('red','blue','green'))+
  geom_label_repel(aes(label = Label),
                   nudge_x = 1.5,
                   na.rm = TRUE,show.legend = F,color='black')

However, this does not give the plot I need: the resulting curves are not colored according to the label. I want x1 and x2 in red, x3 in blue and x4 in green.


To add: I would like to obtain the same plot in the following general case, where I can't add the vector c to the data frame because length(c) is not equal to length(x1) = ... = length(x4):

Time <-c(1:5)
d<-data.frame(Time,
x1= rpois(n = 5, lambda = 10),
x2= runif(n = 5, min = 1, max = 10),
x3= rpois(n = 5, lambda = 5),
x4= runif(n = 5, min = 1, max = 5))

and c=c(1,1,2,3)

Question from: https://stackoverflow.com/questions/65540747/a-simple-plot-for-many-curves-with-different-colors

1 Reply

As you point out in your comments, it is only possible to store the vector of colors as a column in the original data.frame because the data happens to have as many rows as it has data columns. That is a dangerous way to store the information, because the colors belong to the columns rather than to the rows. It's better to define the colors separately and then join them onto the long-format data by variable name before plotting.

Below is an example of how I'd do this with your data.

First, prepare the data without the color mapping for each variable; we'll add that next:

# load necessary packages
library(tidyverse)
library(ggrepel)

# set seed to make simulated data reproducible
set.seed(1)

# simulate data
Time <-c(1:4)

d <- data.frame(Time,
              x1 = rpois(n = 4, lambda = 10),
              x2 = runif(n = 4, min = 1, max = 10),
              x3 = rpois(n = 4, lambda = 5),
              x4 = runif(n = 4, min = 1, max = 5))

Next, make a separate data.frame that maps each variable name to its color group. At some point you'll want the group to be a factor (i.e. discrete rather than continuous) so it can be mapped to color; I convert it here, but you can do it later in the ggplot call if you prefer. Per your request, this scales with your dataset without setting each level by hand, but it requires the vector of color groups to have the same length and the same order as the variable names in d (excluding Time), unless you have some other way to establish that relationship.

# create separate df with color groupings for variable in d
color_grouping <- data.frame(var = names(d)[-1],
                             color_group = factor(c(1, 1, 2, 3)))
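
For the general case from the question (five time points, with the labels kept in a separate vector), the same mapping could be built as sketched below. The names d5, class_labels, curve_names and color_grouping5 are illustrative and do not appear in the original post, and the length check is an added safeguard rather than part of the original answer.

```r
# Hypothetical general case: 5 time points, 4 curves, labels stored outside the data
set.seed(1)
d5 <- data.frame(Time = 1:5,
                 x1 = rpois(n = 5, lambda = 10),
                 x2 = runif(n = 5, min = 1, max = 10),
                 x3 = rpois(n = 5, lambda = 5),
                 x4 = runif(n = 5, min = 1, max = 5))
class_labels <- c(1, 1, 2, 3)

# Every column except Time is treated as a curve
curve_names <- setdiff(names(d5), "Time")

# Fail early if the labels do not line up one-to-one with the curves
stopifnot(length(class_labels) == length(curve_names))

color_grouping5 <- data.frame(var = curve_names,
                              color_group = factor(class_labels))
```

Because the labels are tied to column names rather than to rows, the number of time points no longer matters.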

Then pivot the data to long format with pivot_longer and join the color mapping onto it for plotting.

# pivot d to long and merge in color codes
d_long <- d %>%
  pivot_longer(cols = -Time, names_to = "var", values_to = "value") %>%
  left_join(color_grouping, by = "var")

# inspect final table prior to plotting to confirm color mappings
head(d_long, 4)

# # A tibble: 4 x 4
#    Time var   value color_group
#   <int> <chr> <dbl> <fct>
# 1     1 x1     8    1
# 2     1 x2     1.56 1
# 3     1 x3     4    2
# 4     1 x4     4.97 3
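
A quick check after the join can catch a mismatch between the mapping and the column names. The sketch below uses made-up values in place of the simulated data, so the numbers are illustrative only; the checks themselves are an addition, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Illustrative values standing in for the simulated data
d <- data.frame(Time = 1:4,
                x1 = c(8, 10, 7, 11),
                x2 = c(1.5, 4.2, 9.1, 6.3),
                x3 = c(4, 6, 5, 3),
                x4 = c(4.9, 2.1, 3.3, 1.7))

color_grouping <- data.frame(var = c("x1", "x2", "x3", "x4"),
                             color_group = factor(c(1, 1, 2, 3)))

d_long <- d %>%
  pivot_longer(cols = -Time, names_to = "var", values_to = "value") %>%
  left_join(color_grouping, by = "var")

# A curve missing from color_grouping would show up as NA after the left join
stopifnot(!anyNA(d_long$color_group))

# One row per curve, showing which class it was assigned
curve_classes <- d_long %>% distinct(var, color_group)
curve_classes
```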

Finally, generate a line plot in which color is mapped to the color_group variable. To ensure you get one line per original variable, you also need to set group = var. For more info on this, check the ggplot2 documentation on grouping.

# plot data adding labels for each line
p <- d_long %>%
  ggplot(aes(x = Time, y = value, group = var, color = color_group)) +
  geom_line() +
  labs(color='class') +
  scale_color_manual(values=c('red','blue','green')) +
  geom_label_repel(aes(label = var),
                   data = d_long %>% slice_max(order_by = Time, n = 1),
                   nudge_x = 1.5,
                   na.rm = TRUE,
                   show.legend = FALSE,
                   color='black')

p
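
An unnamed values vector in scale_color_manual is matched to the factor levels by position. Naming the vector by level makes the class-to-color mapping explicit, which could matter if a class is ever absent from the data. The sketch below assumes a small made-up data set (toy) in which class 2 is missing; class_colors and p_named are illustrative names.

```r
library(ggplot2)

# Colors keyed by factor level rather than by position
class_colors <- c("1" = "red", "2" = "blue", "3" = "green")

# Made-up data: two curves, belonging to classes 1 and 3 (class 2 absent)
toy <- data.frame(Time = rep(1:3, times = 2),
                  value = c(1, 2, 3, 3, 2, 1),
                  var = rep(c("a", "b"), each = 3),
                  color_group = factor(rep(c(1, 3), each = 3)))

p_named <- ggplot(toy, aes(x = Time, y = value,
                           group = var, color = color_group)) +
  geom_line() +
  labs(color = "class") +
  scale_color_manual(values = class_colors)

p_named
```

With the named vector, class 3 keeps its intended color even though class 2 does not occur.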

This produces the following plot:

[Image: grouped line plot]

In your comment you suggested separating out and stacking the plots. I'm not sure I fully understood, but one way to accomplish this is with faceting.

For example if you wanted to facet out separate panels by color_group, you could add this line to the plot above:

p + facet_grid(rows = "color_group")

Which gives this plot:

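[Image: faceted plot]

Note that when the faceting variable is passed to rows as a string, as here, it must be put in quotes. The unquoted alternative is to wrap it in vars(), i.e. facet_grid(rows = vars(color_group)).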

