Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Connecting mean points of a line plot in ggplot2

I have a sample dataset with the columns: PATIENTID (IDs of patients), VISITNUMBER (their number of visits to the hospital), TIME (time in years since first visit), HEALTH (their health status). I am trying to plot HEALTH over time.

This is my code in R:

library(ggplot2)

# data structure
PATIENTID <- c(126, 126, 126, 255, 255, 389, 389, 389, 389, 389, 470, 470, 470)
VISITNUMBER <- c(1, 2, 3, 1, 2, 1, 2, 3, 4, 5, 1, 2, 3)
TIME <- c(0, 4, 6, 0, 3, 0, 1, 2, 3, 4, 0, 1, 2)
HEALTH <- c(0.333, 0.452, 0.468, 0.571, 0.522, 0.444, 0.452, 0.431, 0.510, 0.532, 0.214, 0.333, 0.400)

mydata <- data.frame(PATIENTID, VISITNUMBER, TIME, HEALTH)

# converting patient ID and visit number to factors
mydata$PATIENTID   <- factor(mydata$PATIENTID)
mydata$VISITNUMBER <- factor(mydata$VISITNUMBER)

# creating a spaghetti plot of health over time
sp_HEALTH <- ggplot(data = mydata, aes(TIME, HEALTH, group = PATIENTID))
sp_HEALTH +
  geom_line() +
  # linear fit across all patients (this draws the straight line)
  stat_smooth(aes(group = 1), method = "lm", se = FALSE) +
  # mean HEALTH at each TIME; `fun` replaces the deprecated `fun.y` (ggplot2 >= 3.3.0)
  stat_summary(aes(group = 1), geom = "point", fun = mean,
               shape = 17, size = 3, col = "red")

This code produces the following plot:

Spaghetti plot

I want to connect the mean points (the red triangles in the plot linked above) with a blue line that runs from point to point, but what I get is a straight, regression-type line. I want the means joined the way a regular line plot joins its points (see the link below). How do I add a line that connects the mean points?

Sample line plot

Thank you!

Question from: https://stackoverflow.com/questions/66057114/connecting-mean-points-of-a-line-plot-in-ggplot2


1 Reply


It is perhaps easier to calculate the mean with dplyr::mutate first, then add separate geoms for the patient values and the mean values:

library(dplyr)
library(ggplot2)

mydata %>% 
  mutate(PATIENTID = factor(PATIENTID)) %>% 
  # mean HEALTH across all patients at each TIME, stored on every row
  group_by(TIME) %>% 
  mutate(MEAN = mean(HEALTH)) %>% 
  ungroup() %>% 
  ggplot() + 
  # one line per patient
  geom_line(aes(TIME, HEALTH, group = PATIENTID)) + 
  # a single line and a set of points for the means
  geom_line(aes(TIME, MEAN), color = "blue") + 
  geom_point(aes(TIME, MEAN), color = "red", size = 3, shape = 17)
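
To see which values the mean line passes through, the per-TIME means can also be collapsed to one row per time point. This is only a sketch of a variation on the code above: it uses summarise instead of mutate, and the names mean_by_time, N and p_summary are made up for illustration. The data frame is rebuilt here so the snippet runs on its own.

```r
library(dplyr)
library(ggplot2)

# Same sample data as in the question (VISITNUMBER omitted, it is not plotted)
mydata <- data.frame(
  PATIENTID = factor(c(126, 126, 126, 255, 255, 389, 389, 389, 389, 389, 470, 470, 470)),
  TIME      = c(0, 4, 6, 0, 3, 0, 1, 2, 3, 4, 0, 1, 2),
  HEALTH    = c(0.333, 0.452, 0.468, 0.571, 0.522, 0.444, 0.452, 0.431,
                0.510, 0.532, 0.214, 0.333, 0.400)
)

# One row per TIME: the mean HEALTH and the number of patients behind it
mean_by_time <- mydata %>%
  group_by(TIME) %>%
  summarise(MEAN = mean(HEALTH), N = n(), .groups = "drop")

print(mean_by_time)

# Patient lines come from mydata; the mean line and points come from mean_by_time
p_summary <- ggplot(mydata, aes(TIME, HEALTH)) +
  geom_line(aes(group = PATIENTID)) +
  geom_line(data = mean_by_time, aes(y = MEAN), color = "blue") +
  geom_point(data = mean_by_time, aes(y = MEAN), color = "red", size = 3, shape = 17)

p_summary
```

The N column is worth a look before reading much into the blue line: in this sample the mean at TIME 6 rests on a single patient, so the last segment simply follows that patient's value.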

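Alternatively, you could just add a second stat_summary with geom = "line". Note that in both cases aes() is set in the individual layers, not in ggplot(), so the group = PATIENTID grouping applies only to the patient lines and the mean is taken across all patients at each TIME.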

mydata %>% 
  ggplot() +
  geom_line(aes(TIME, HEALTH, group=PATIENTID)) + 
  stat_summary(aes(TIME, HEALTH), geom = "point", fun = mean, shape = 17, size = 3, col = "red") + 
  stat_summary(aes(TIME, HEALTH), geom = "line",  fun = mean, col = "blue")
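
If you would rather keep the question's original sp_HEALTH object, where group = PATIENTID is set in ggplot(), the same result should be reachable by overriding the grouping with aes(group = 1) in each summary layer, as the question already does for the points. The following is a sketch under that assumption; p_means is a name made up for illustration, and the fun argument needs ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Same sample data as in the question (VISITNUMBER omitted, it is not plotted)
mydata <- data.frame(
  PATIENTID = factor(c(126, 126, 126, 255, 255, 389, 389, 389, 389, 389, 470, 470, 470)),
  TIME      = c(0, 4, 6, 0, 3, 0, 1, 2, 3, 4, 0, 1, 2),
  HEALTH    = c(0.333, 0.452, 0.468, 0.571, 0.522, 0.444, 0.452, 0.431,
                0.510, 0.532, 0.214, 0.333, 0.400)
)

# Grouping set globally, as in the question
sp_HEALTH <- ggplot(mydata, aes(TIME, HEALTH, group = PATIENTID))

p_means <- sp_HEALTH +
  geom_line() +
  # group = 1 pools all patients, so the mean is taken per TIME, not per patient
  stat_summary(aes(group = 1), geom = "line", fun = mean, colour = "blue") +
  stat_summary(aes(group = 1), geom = "point", fun = mean,
               shape = 17, size = 3, colour = "red")

p_means
```

Without aes(group = 1), stat_summary would inherit group = PATIENTID and compute a "mean" for each patient at each TIME, which is just that patient's own value, so the blue line would retrace the patient lines.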
