How to scale (normalise) values of ggplot2 stat_bin2d within each column (by X axis)


I have a ggplot stat_bin2d "heatmap".

library(ggplot2)
value<-rep(1:5, 1000)
df<-as.data.frame(value)    
df$group<-rep(1:7, len=5000)
df<-df[sample(nrow(df), 3000), ]
ggplot(df, aes(factor(group), factor(value))) +stat_bin2d()

I have tried to add fill to aes:

aes(factor(group), factor(value),fill = (..count..)/mean(..count..))

This was meant to mimic ..density.. (which is not accepted), but it is not what I want: it seems to normalise by the counts of the whole df. I want the count of values in each group (along the x axis) normalised by the mean (or sum, or another statistic) within that group. Unfortunately, sum(..count..) seems to give the sum over the whole df, not just over the column.


Answer by Matt

I know this post is ancient, but I came across it when trying to do the same thing and didn't want to use geom_tile. I was able to implement it with after_stat and a normalization function:

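library(dplyr)

# Rescale v so that, within each x slice, it integrates to about 1 along y.
norm_across_y <- function(v, x, y){
    data.frame(v=v, x=x, y=y) %>%
        group_by(x) %>%
        # (max(y)-min(y))/n() approximates the grid spacing dy, so dy*sum(v) approximates the integral over y
        mutate(v=v/((max(y)-min(y))/n()*sum(v))) %>%
        ungroup() %>%
        pull(v)
}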

# data, xvar and yvar stand in for your own data frame and its columns
ggplot(data, aes(x=xvar, y=yvar)) +
    stat_density_2d_filled(aes(fill=after_stat(norm_across_y(density, x, y))), geom="raster", contour=FALSE, n=500) +
    geom_point(color="red", shape="x") +
    scale_x_continuous(expand=c(0,0)) +
    scale_y_continuous(expand=c(0,0)) +
    scale_fill_viridis_c(limits=c(0,NA))

This normalizes each slice of the x axis so that the integral along the y axis is 1, which was my use case.
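
For the discrete counts in the question, the same per-column idea can be sketched without after_stat by counting the cells first and then dividing each count by the total of its own group. This is only a minimal sketch under a few assumptions: it swaps stat_bin2d for precomputed counts drawn with geom_tile, the column name share and the set.seed call are introduced purely for illustration, and sum can be replaced by mean or max depending on the statistic you want.

```r
library(ggplot2)

set.seed(1)  # assumption: seed added only to make the sketch reproducible
df <- data.frame(value = rep(1:5, 1000),
                 group = rep(1:7, length.out = 5000))
df <- df[sample(nrow(df), 3000), ]

# Count the observations in every (group, value) cell.
counts <- as.data.frame(table(group = df$group, value = df$value),
                        responseName = "count")

# Divide each count by the total of its own column (group).
# Swap `sum` for `mean` or `max` to normalise by another statistic.
counts$share <- counts$count / ave(counts$count, counts$group, FUN = sum)

# Draw the precomputed shares as tiles (hypothetical column name `share`).
p <- ggplot(counts, aes(group, value, fill = share)) +
    geom_tile()
print(p)
```

With sum as the statistic, the shares within each group add up to 1, so every column of the heatmap is on the same scale regardless of how many rows the group has.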