How to use ggpaired to compare test scores before and after a lesson, with line widths weighted by count?


I am trying to use ggpaired to show the increase in students' scores before and after a lesson on a 12-point scale. My data frame holds each student's number of correct answers before and after the lesson, e.g.

Before After
6 8
5 9

and so on, for 40 students.

The issue I am having is that, while ggpaired does a good job of showing the overall increase, the data are discrete, so the plot loses the sense of how many students increased their score by a given amount. I was able to add geom_count(), which gives a sense of how many students scored X on the pre-test and Y on the post-test, but I am hoping to add a similar weight to the lines connecting the dots.

I have produced this graph, which is great, but I would love to weight the lines in proportion to the number of students who achieved each improvement.

[image: ggpaired plot of Before vs. After scores with geom_count() points]

Any help is very appreciated!

I found this post, which shows an example of how to do this, but it doesn't quite fit my use case, and I am not good enough at coding in R to adapt it to my situation: Making a ggpaired plot where line.color is a weighted function?


There is 1 answer

Allan Cameron On

I don't think there's a way to do this in ggpaired, but it's important to realise that ggpaired is just a wrapper around geom_boxplot and geom_line with some preparatory data wrangling and theme choices, all of which we can do ourselves to get the desired result:

library(tidyverse)

# Long format for the boxplots and points: one row per student per condition
box_df <- df %>%
  pivot_longer(Before:After, names_to = "Condition", values_to = "Value") %>%
  mutate(Condition = factor(Condition, c("Before", "After")))

# One line per distinct (Before, After) pair; n is the number of students
# sharing that pair and is later mapped to line width
joining_lines <- df %>%
  count(Before, After) %>%
  mutate(line_group = row_number()) %>%
  pivot_longer(Before:After, names_to = "Condition", values_to = "Value") %>%
  mutate(Condition = factor(Condition, c("Before", "After")))

# Note: the linewidth aesthetic requires ggplot2 3.4.0 or later
ggplot(box_df, aes(Condition, Value)) +
  geom_boxplot() +
  geom_count() +
  geom_line(aes(group = line_group, linewidth = n), data = joining_lines) +
  theme_classic(base_size = 16) +
  scale_linewidth(range = c(0.25, 1.5)) +
  theme(legend.position = "top")

[image: boxplots of Before and After with count-sized points and joining lines whose width scales with n]


Data used for example

set.seed(1)

df <- data.frame(Before = rbinom(40, 10, 0.5), After  = rbinom(40, 10, 0.75))
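
To see where the line weights come from, it may help to look at the intermediate table that the count() step produces. The following is only a sketch using the same simulated data as above; the name pair_counts is introduced here for illustration and does not appear in the answer's code.

```r
library(dplyr)
library(tidyr)

# Same simulated data as in the answer
set.seed(1)
df <- data.frame(Before = rbinom(40, 10, 0.5), After = rbinom(40, 10, 0.75))

# One row per distinct (Before, After) pair; n is the number of students
# who share that pair (pair_counts is a hypothetical name for illustration)
pair_counts <- df %>% count(Before, After)

# The counts should add back up to the number of students
stopifnot(sum(pair_counts$n) == nrow(df))

# Reshape to long format: each pair becomes two rows (Before and After)
# that share the same line_group and n, so geom_line() can join them
joining_lines <- pair_counts %>%
  mutate(line_group = row_number()) %>%
  pivot_longer(Before:After, names_to = "Condition", values_to = "Value")

head(joining_lines)
```

Each line in the plot corresponds to one line_group, and its width is driven by n, so pairs shared by many students are drawn more heavily.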