Creating geom_hlines for two variables with two groups


I'm struggling to get 4 total geom_hlines on the following plot:

[image of the current plot]

I want LDL cholesterol to have its own mean hline. Here's my code - any suggestions? I think it has to do with my errorbar but I can't figure out how to add LDL cholesterol in.

    GAchol <- ggplot(data = df, aes(x=Responder, y=Ncholest, color = "Cholesterol", na.rm = TRUE)) + 
      geom_jitter() +
      geom_jitter(data = df, aes(y=Nldl_cho, color = "LDL Cholesterol")) +
      geom_errorbar(
        data = df%>% group_by(Responder) %>% summarise(Ncholest = mean(Ncholest)),
        aes(x = Responder, ymin = Ncholest, ymax = Ncholest)
      ) + geom_hline(aes(yintercept = mean(Ncholest)), lty = 2) +
      geom_jitter(data = df, aes(y=Nldl_cho, color = "LDL Cholesterol")) +
      geom_hline(aes(yintercept = mean(Nldl_cho)), lty = 2) +
      theme_bw() +
      stat_summary(fun = mean,
        geom = "errorbar",
        aes(ymax = ..y.., ymin = ..y..),
        position = position_dodge(width = 0.8),
        width = 0.8
      )
1 answer

Answer by Stewart Macdonald

I don't quite follow your code so I might be way off here, but are you after something like this?

    library(dplyr)    # %>%, bind_rows(), mutate()
    library(ggplot2)

    # Set up sample data in long format: one row per measurement
    df <- data.frame(biomarker='Cholesterol', response='Responders', lipids=rnorm(n=100, mean=-0.3, sd=1)) %>%
      bind_rows(
        data.frame(biomarker='Cholesterol', response='Non-responders', lipids=rnorm(n=100, mean=0.5, sd=1)),
        data.frame(biomarker='LDL cholesterol', response='Responders', lipids=rnorm(n=100, mean=-0.4, sd=1)),
        data.frame(biomarker='LDL cholesterol', response='Non-responders', lipids=rnorm(n=100, mean=0.6, sd=1))
      ) %>%
      mutate(
        biomarker = as.factor(biomarker),
        response = as.factor(response)
      )

    # Preview the first two rows for each of the four groups
    df[c(1:2, 101:102, 201:202, 301:302), ]
    #>           biomarker       response     lipids
    #> 1       Cholesterol     Responders -1.1312455
    #> 2       Cholesterol     Responders  0.5153858
    #> 101     Cholesterol Non-responders  1.4085121
    #> 102     Cholesterol Non-responders -0.3848261
    #> 201 LDL cholesterol     Responders -0.3880410
    #> 202 LDL cholesterol     Responders -0.8081946
    #> 301 LDL cholesterol Non-responders -0.3934018
    #> 302 LDL cholesterol Non-responders  0.4481896

    # Simplified plotting code: stat_summary() draws one mean bar per
    # response/biomarker combination (the colour aesthetic is inherited)
    ggplot(data=df, aes(x=response, y=lipids, col=biomarker)) +
      geom_jitter() +
      stat_summary(fun=mean, geom="crossbar", width=0.5) +
      theme_bw()

Scatterplot

I used the stat_summary code from this answer.
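
If your real data is in the wide layout from the question (one column each for `Ncholest` and `Nldl_cho`), you would probably need to reshape it to long format before the plotting code above applies. Here is a rough sketch of how that might look. The `wide` data frame is made up for illustration; only the column names `Responder`, `Ncholest` and `Nldl_cho` come from your question, and I'm assuming `tidyr` is available.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for the wide data frame in the question;
# the values are random, only the column names are taken from the question.
set.seed(1)
wide <- data.frame(
  Responder = rep(c("Responders", "Non-responders"), each = 50),
  Ncholest  = rnorm(100),
  Nldl_cho  = rnorm(100)
)

# Reshape to one row per measurement, with the biomarker name in its own column
long <- wide %>%
  pivot_longer(
    cols = c(Ncholest, Nldl_cho),
    names_to = "biomarker",
    values_to = "lipids"
  ) %>%
  mutate(biomarker = recode(biomarker,
    Ncholest = "Cholesterol",
    Nldl_cho = "LDL Cholesterol"
  ))

# The four means the crossbars mark (2 responder groups x 2 biomarkers)
group_means <- long %>%
  group_by(Responder, biomarker) %>%
  summarise(mean_lipids = mean(lipids, na.rm = TRUE), .groups = "drop")

# Same plot as above, applied to the reshaped data
p <- ggplot(long, aes(x = Responder, y = lipids, colour = biomarker)) +
  geom_jitter() +
  stat_summary(fun = mean, geom = "crossbar", width = 0.5) +
  theme_bw()
p
```

`group_means` should have four rows, one per mean line, which is a quick way to check that the plot shows what you expect.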