Changing the y-axis scale for intersection_size from ComplexUpset package


I have created 4 different upset plots using the ComplexUpset package in R. The 4 plots have different intersection sizes because the data frames range in length from 300 to 12000. Since I want to compare these 4 plots, I would like them to share the same y-axis scale for clarity and ease of discussion. To do that, I want to normalize the intersection_size data to the range 0 to 1.

After reading the Upset and ComplexUpset documentation, it seems that the intersections are calculated internally and cannot really be extracted. I see that you can still manipulate the intersection sizes, for example:

'Intersection size'=intersection_size(text_mapping=aes(label=paste0(round(
            !!get_size_mode('exclusive_intersection')/!!get_size_mode('inclusive_union') * 100
        ), '%')))

but I couldn't get a normalization like this to work:

'Intersection size'=intersection_size(text_mapping=aes(label=paste0(round(
            !!get_size_mode('exclusive_intersection')/max(!!get_size_mode('inclusive_union'))
        ))))

I saw @krassowski's solution to "How to to assign logarithmic scale to “Intersection size” using ComplexUpset library?", and I'm hoping to do something similar with geom_bar, but normalizing the values instead of using a log scale. For example, the movies dataset produces the following plot:

library(ComplexUpset)
library(ggplot2)

movies = as.data.frame(ggplot2movies::movies)
movies[movies$mpaa == '', 'mpaa'] = NA
movies = na.omit(movies)
genres = colnames(movies)[18:24]

plot2 <- upset(
    movies,
    genres,
    base_annotations=list(
        'Size'=intersection_size(counts=FALSE)
    ),
    min_size=5,
    width_ratio=0.1
)


Here, instead of the y-axis going from 0 to 400, I want it to go from 0 to 1, so that I can compare the 4 similar upset plots.

------ Solved:------
I did the following to normalize the intersection size (y = y / max(y)):

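# Name of the internal column marking membership in each exclusive intersection
presence = ComplexUpset:::get_mode_presence('exclusive_intersection')

# Sum the presence column per intersection to get the intersection sizes
summarise_values = function(df){
    aggregate(
        as.formula(paste0(presence, '~intersection')),
        df,
        FUN = sum
    )
}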

upset(
    movies,
    genres,
    base_annotations=list(
        'Normalized intersection size'=(
            ggplot()
            + geom_bar(
                data=summarise_values,
                stat='identity',
                # Divide each intersection size by the largest one
                aes(y=!!presence / max(!!presence))
            )
        )
    ),
    width_ratio=0.1
)

I think the results make sense as I'm seeing them, but if anyone sees any logical mistake, feel free to leave a comment.
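
As a sanity check on the logic, here is a minimal base R sketch of what y / max(y) does to the bar heights, independent of ComplexUpset. The helper normalise_sizes, the toy data frames and the column names present and intersection are made up for illustration; only the aggregate() pattern mirrors summarise_values above. It also computes y / sum(y) for comparison, since that is a different normalization.

```r
# Hypothetical helper (not part of ComplexUpset): count the rows in each
# intersection, then rescale the counts in two different ways.
normalise_sizes <- function(df) {
    sizes <- aggregate(present ~ intersection, df, FUN = sum)
    # Divide by the largest intersection: the tallest bar becomes exactly 1
    sizes$scaled_by_max <- sizes$present / max(sizes$present)
    # Divide by the total count: the bars sum to 1 (share of all elements)
    sizes$share_of_total <- sizes$present / sum(sizes$present)
    sizes
}

# Two toy datasets of very different lengths (8 and 200 rows)
small <- data.frame(
    intersection = rep(c('A', 'B', 'A-B'), times = c(4, 2, 2)),
    present = 1
)
large <- data.frame(
    intersection = rep(c('A', 'B', 'A-B'), times = c(50, 100, 50)),
    present = 1
)

small_sizes <- normalise_sizes(small)
large_sizes <- normalise_sizes(large)

print(small_sizes)
print(large_sizes)
```

With y / max(y), every plot's tallest bar is 1 regardless of how many rows the data frame has, so the bars show sizes relative to that plot's largest intersection. They are not the fraction of the dataset that falls into each intersection; if that is what the comparison needs, dividing by sum(y) may be the better fit.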
