Custom group mean function for ggpairs

According to the documentation of the ggpairs() function in the GGally R package, custom functions can be passed to the "lower" and "upper" arguments. For continuous-discrete variable combinations, I would like to simply display the means of the continuous variable within the categories of the categorical variable (preferably using dots, not bars), if possible further stratified by another categorical variable using a color aesthetic.

I found some information in the following thread:

https://github.com/ggobi/ggally/issues/218

However, my knowledge of ggpairs (and ggplot2) is too superficial to produce such a custom function from the template in the thread. Also, the variable name "Species" appears to be hard-coded into the template, and I would prefer not to have any hard-coded information in the custom function if at all possible.

I would be very grateful if somebody could help me out with a template or a sketch of a solution, e.g. using the following example (where "custom_function" would need to be replaced with the function described above):

library(GGally)

dat <- reshape::tips
pm <- ggpairs(dat,
              mapping = aes(color = sex, alpha = 0.3),
              columns = c("total_bill", "smoker", "time", "tip"),
              showStrips = TRUE,
              lower = list(combo = custom_function))
print(pm)

There is 1 answer

h_bauer (best answer)

Based on the comment from @aosmith, I wrote a custom function that seems to work well enough for my purposes. I haven't tested it extensively so far, but maybe it is helpful anyway:

library(GGally)
library(ggplot2)
library(ggstance)

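gmean_point <- function(data, mapping, ...) {

  # Evaluate the y aesthetic against the data to find the continuous axis.
  # eval_data_col() (from GGally) also handles the quosures that aes()
  # creates in ggplot2 >= 3.0.0, where a plain eval() no longer returns
  # the column.
  y <- eval_data_col(data, mapping$y)

  if(is.numeric(y)) {
    # y is continuous: plot the mean of y within each category on x
    p <- ggplot(data) +
      geom_blank(mapping) +
      stat_summary(mapping,
                   geom = 'point', fun = mean,  # 'fun.y' before ggplot2 3.3.0
                   position = position_dodge(width = 0.2))
  } else {
    # x is continuous: use the horizontal stat and dodge from ggstance
    p <- ggplot(data) +
      geom_blank(mapping) +
      stat_summaryh(mapping,
                    geom = 'point', fun.x = mean,
                    position = position_dodgev(height = 0.2))
  }

  p

}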

pm <- ggpairs(reshape::tips,
              mapping = aes(color = sex, alpha = 0.3),
              columns = c("total_bill", "smoker", "time", "tip"),
              showStrips = TRUE,
              lower = list(combo = gmean_point),
              upper = list(combo = 'box'))
print(pm)

Plot produced by code above
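
For readers who would rather not depend on ggstance, the same idea can probably be written with ggplot2 alone. This is only a sketch and has not been checked against the ggpairs call above: the name gmean_point_gg is made up for illustration, and it assumes ggplot2 >= 3.3.0 (for the fun and orientation arguments of stat_summary()) and a GGally version that exports eval_data_col().

```r
library(ggplot2)
library(GGally)  # only needed for eval_data_col()

# Hypothetical ggstance-free variant of gmean_point (name is illustrative).
gmean_point_gg <- function(data, mapping, ...) {
  # Evaluate the y aesthetic against the data to see which axis is continuous.
  y <- eval_data_col(data, mapping$y)

  # Numeric y: average y within each x category ("x" orientation).
  # Otherwise: average x within each y category ("y" orientation).
  orientation <- if (is.numeric(y)) "x" else "y"

  ggplot(data, mapping) +
    geom_blank() +  # same blank layer as in gmean_point above
    stat_summary(geom = "point", fun = mean,
                 orientation = orientation,
                 position = position_dodge(width = 0.2))
}

# Standalone check on a toy data frame, outside of ggpairs:
toy <- data.frame(g = factor(c("a", "a", "b", "b")), v = c(1, 3, 10, 20))
print(gmean_point_gg(toy, aes(x = g, y = v)))
```

It should plug into ggpairs the same way, i.e. lower = list(combo = gmean_point_gg).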