How to solve inconsistent plotting when using ggplotly for a simple boxplot?

I am trying to use the new ggplot2 (dev version) feature to make my plot interactive. I also have the dev version of plotly installed.

However, it does not seem to be working as promised. Consider this:

    library(ggplot2)
    library(plotly)

    gg <- ggplot(data = mtcars, aes(x = factor(cyl), y = hp)) +
        geom_boxplot(aes(fill = factor(cyl), color = factor(cyl), alpha = 1/2))

    gg

    ggplotly(gg)

If you look closely, you'll see that the outlier is marked differently in the ggplotly() output. How can I get rid of this, so that it looks the same as the ggplot version (the first fig)?

Also, how can I add or edit legends in the ggplotly output?

There is 1 answer

wxxyyyzz (best answer)

It's a little awkward to answer more than a year later, but hopefully this helps anyone who searches for the same issue, as I recently ran into a similar one (I could not hide the outliers in geom_boxplot).

According to the plotly box plot reference for R and Python, boxpoints can be one of 'all', FALSE (False in Python), 'suspectedoutliers', or 'outliers', each of which displays outliers differently. When 'suspectedoutliers' is chosen, you can adjust the style of the marker. Here, the difference comes from the outline drawn around the outlier marker.

I haven't found an easy way to do this in ggplotly yet, but it is doable with some effort using plotly_build (you have to dig into the structure where it stores the data).

First,

    library(ggplot2)
    library(plotly)

    gg <- ggplot(data = mtcars, aes(x = factor(cyl), y = hp)) +
        geom_boxplot(aes(fill = factor(cyl), color = factor(cyl), alpha = 1/2))
    ggly <- plotly_build(gg)

Have a look at ggly$x$data. You should see a list of 3 items, which correspond to the 3 boxes on the graph.
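To make that inspection concrete, here is a minimal sketch of how one might look at the built object. It assumes ggplot2 and plotly are installed; the exact fields present in each trace can differ between plotly versions, so treat the output as something to explore rather than a fixed structure.

```r
library(ggplot2)
library(plotly)

gg <- ggplot(data = mtcars, aes(x = factor(cyl), y = hp)) +
    geom_boxplot(aes(fill = factor(cyl), color = factor(cyl), alpha = 1/2))
ggly <- plotly_build(gg)

# How many traces were generated (one per box is expected here)
n_traces <- length(ggly$x$data)
n_traces

# The plotly trace type of each item
trace_types <- vapply(ggly$x$data, function(tr) tr$type, character(1))
trace_types

# Marker settings of the first trace, including any marker outline
str(ggly$x$data[[1]]$marker)
```

The `marker` and `line` entries shown by `str()` are the ones modified in the next step.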

Next,

    for (i in seq_along(ggly$x$data)) {
        # ggly$x$data[[i]]$boxpoints <- "outliers"
        ggly$x$data[[i]]$marker$line <- NULL  # drop the outline around the outlier markers
        ggly$x$data[[i]]$line$width <- 1      # set the width of the box's line
    }
    rm(i)
    ggly

Since the traces did not have a $boxpoints argument but did have parameters for the marker and the line, I removed the outline around the marker, then changed the width of the box's line. This fixes the outlier appearance. If you also want to fix the legend, you could check $legendgroup; however, I don't have a good solution yet.
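For the legend, one possible direction (a sketch only, not a confirmed fix) is to edit each trace's `name` and `showlegend` fields in the same way, and to adjust the legend position with `plotly::layout()`. The simplified plot, the `new_names` labels, and the assumption that traces come out in factor-level order are all illustrative choices, so check them against your own `ggly$x$data`.

```r
library(ggplot2)
library(plotly)

# Simplified plot: a single fill aesthetic keeps the trace names easy to follow
gg <- ggplot(data = mtcars, aes(x = factor(cyl), y = hp)) +
    geom_boxplot(aes(fill = factor(cyl)))
ggly <- plotly_build(gg)

# Hypothetical labels; assumes one trace per cyl level, in level order
new_names <- paste(levels(factor(mtcars$cyl)), "cylinders")

# Only rename when the trace count matches the assumption above
if (length(ggly$x$data) == length(new_names)) {
    for (i in seq_along(ggly$x$data)) {
        ggly$x$data[[i]]$name <- new_names[i]
        ggly$x$data[[i]]$showlegend <- TRUE
    }
}

# Place the legend horizontally
ggly <- plotly::layout(ggly, legend = list(orientation = "h"))
ggly
```

If the trace count does not match, the names are left untouched, which is a cue to inspect `ggly$x$data` before renaming anything.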

By the way, if anyone is looking for a way to hide/disable outliers as you can in geom_boxplot, you can use 'suspectedoutliers' and style the outliers by setting their opacity to 0. (I did this in order to plot another layer of jitter later.)

    for (i in seq_along(ggly$x$data)) {
        ggly$x$data[[i]]$boxpoints <- "suspectedoutliers"
        ggly$x$data[[i]]$marker$opacity <- 0  # make the outlier markers invisible
    }
    rm(i)
    ggly
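If the jitter layer is already part of the ggplot object, the loop above would also set the opacity of the jitter points to 0, because they are stored as traces too. A minimal sketch of one way around that, assuming box traces carry `type == "box"` and jitter points become scatter traces (the names `gg_jit` and `ggly_jit` are just for illustration):

```r
library(ggplot2)
library(plotly)

gg_jit <- ggplot(data = mtcars, aes(x = factor(cyl), y = hp)) +
    geom_boxplot() +
    geom_jitter(width = 0.2)
ggly_jit <- plotly_build(gg_jit)

# Flag the box traces so the jitter traces are left alone
is_box <- vapply(
    ggly_jit$x$data,
    function(tr) identical(tr$type, "box"),
    logical(1)
)

# Hide outlier markers on the box traces only
for (i in which(is_box)) {
    ggly_jit$x$data[[i]]$boxpoints <- "suspectedoutliers"
    ggly_jit$x$data[[i]]$marker$opacity <- 0
}
ggly_jit
```

Filtering by trace type keeps the jittered points visible while the box outliers are hidden.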

Again, hope it helps.