Hexagon and heat style Density Plots in R

I'm trying to create a figure that shows machine error as a function of temperature and humidity. After reading a paper (see the image below), it seems the best approach is a hexagon or density plot. My problem is that (1) every density plot I create comes out as a grey diagram that shows no data at all, and (2) every hexagon plot shows only count data.

Example using a subset of my data (with only the temperature, humidity and PM data included, as that's what I want to display):

library(ggplot2)
ggplot(DM_EPA_1H) +
  geom_hex(aes(x = Relative.humidity, y = Temperature, color = Diff_PM1))

Image produced with geom_hex

The image above is along the lines of what I want, but it's difficult to interpret because it shows count data. I can't tell under what conditions (temperature/humidity) we are seeing an error.

ggplot(DM_EPA_1H, aes(x=Relative.humidity,y=Temperature), na.rm = FALSE)+
stat_density_2d(aes(fill=Diff_PM1), geom = "polygon")+
scale_fill_viridis_c()

Image produced by stat_density_2d

The image above isn't very interpretable, and I'm unsure what the next best route is to get the desired outcome.

Desired format for displaying data. Credit Lui et al., 2019 (Atmosphere, 10, 41)

Unfortunately, the paper does not include source code for these figures, which makes them difficult to reproduce. It's possible they weren't even made with ggplot, but to me they looked like they were.

I appreciate the help. Let me know if any more clarification is needed.

1 Answer

Answer by shizundeiku (best answer)

Use stat_summary_hex and geom_density2d. With stat_summary_hex, you map the variable you want to summarise to the z aesthetic and pass the function to apply within each bin as fun, so each hexagon shows that summary instead of the count. Here I assumed you wanted the mean, but you can use essentially any function. You also made it a bit difficult by not providing any example data, so I generated some randomly.

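library(tidyverse)

# Simulated data standing in for the real measurements
set.seed(0)
DM_EPA_1H <- tibble(
  Relative.humidity = rbeta(1000, 6, 1.3) * 100,
  Temperature = rnorm(1000, mean = 50, sd = 10),
  Diff_PM1 = rnorm(1000, mean = 0, sd = 5)
)

ggplot(DM_EPA_1H, mapping = aes(x = Relative.humidity, y = Temperature)) +
  # Mean Diff_PM1 per hexagonal bin, instead of the count
  stat_summary_hex(mapping = aes(z = Diff_PM1), fun = ~mean(.x)) +
  # Binned diverging fill scale: red below 0, grey near 0, blue above 0
  scale_fill_steps2(low = "#eb0000", mid = "#e0e0e0", high = "#1094c4") +
  geom_hex(stat = "identity") +
  # Contours of the 2D point density, with the raw points on top
  geom_density2d(colour = "black") +
  geom_point(size = 0.5)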

This roughly reproduces the original plot:

roughly reproduced original plot from Lui et al. 2019
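
Since fun can be essentially any function, here is a minimal sketch of the same idea with the median instead of the mean. The data are again simulated, and the names df and p_median, the bins = 15 setting and the legend title are my own choices for illustration, not anything from the paper. I also used a continuous diverging scale (scale_fill_gradient2) centred on 0 here, just to show an alternative to the binned scale.

```r
library(ggplot2)  # stat_summary_hex() also needs the hexbin package installed

set.seed(0)
# Simulated stand-in data (assumption: same columns as the real data frame)
df <- data.frame(
  Relative.humidity = rbeta(1000, 6, 1.3) * 100,
  Temperature       = rnorm(1000, mean = 50, sd = 10),
  Diff_PM1          = rnorm(1000, mean = 0, sd = 5)
)

# Median error per hexagon; bins = 15 is an arbitrary choice for illustration
p_median <- ggplot(df, aes(x = Relative.humidity, y = Temperature)) +
  stat_summary_hex(aes(z = Diff_PM1), fun = median, bins = 15) +
  # Continuous diverging scale centred on zero error
  scale_fill_gradient2(
    low = "#eb0000", mid = "#e0e0e0", high = "#1094c4", midpoint = 0
  ) +
  labs(fill = "Median Diff_PM1")

p_median
```

Fewer, larger bins give a smoother picture but hide local structure, so the bin count is worth experimenting with on the real data.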

Of course, if you want to use viridis as you indicated in your second code sample, you can do that as well with scale_fill_viridis_c instead of scale_fill_steps2.
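
For completeness, a minimal sketch of that viridis variant, again on simulated data. The names df and p_viridis and the legend title are assumptions for illustration only.

```r
library(ggplot2)  # stat_summary_hex() also needs the hexbin package installed

set.seed(0)
# Simulated stand-in data (assumption: same columns as the real data frame)
df <- data.frame(
  Relative.humidity = rbeta(1000, 6, 1.3) * 100,
  Temperature       = rnorm(1000, mean = 50, sd = 10),
  Diff_PM1          = rnorm(1000, mean = 0, sd = 5)
)

# Mean error per hexagon, coloured with the continuous viridis palette
p_viridis <- ggplot(df, aes(x = Relative.humidity, y = Temperature)) +
  stat_summary_hex(aes(z = Diff_PM1), fun = mean) +
  scale_fill_viridis_c(name = "Mean Diff_PM1")

p_viridis
```

Note that viridis is a sequential palette, so it does not mark zero error the way a diverging scale does. If the sign of the error matters, the diverging scale is probably easier to read.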