Visualise "corrplot" in R with more than one variable/parameter


I am using the corrplot function from the corrplot package in R to visualise correlations between channels of data. My question is whether I can map two "factors" or parameters onto each circle. For example, could the size of the circle represent the correlation while the colour represents something else, such as the lag? My data is cross-correlation data in which I have identified the maximum correlation and the lag at which it occurs, and I want to visualise both in one figure.

Essentially, what I am trying to achieve is the same as in this question: R: using corrplot to visualize two variables (e.g., correlation and p-value) using the size and colour of the circles. However, neither of the solutions there works for me, and I get an error when installing the packages from GitHub.

I have vectors of lags and correlations:

CCO_lag = c(0, NaN, -2, NaN, -5, -4, -6, -3, 0, NaN, 1, 3, NaN, 0, -3, NaN)
CCO_r = c(-0.4757, NaN, 0.5679, NaN, 0.5582, 0.5899, 0.5857, 0.6256, -0.4646, NaN, -0.6286, -0.4699, NaN, -0.7710, 0.5869, NaN)

Let's say I want to visualise this as a 1x16 grid using corrplot. I want the size of each circle to depend on the correlation value in CCO_r and the colour of each circle to depend on the lag value in CCO_lag.

Can anyone help?

Thanks!

Gregor Thomas (best answer)

This doesn't seem much like a correlation plot to me, but we can do this:

library(ggplot2)

CCO_lag = c(0, NaN, -2, NaN, -5, -4, -6, -3, 0, NaN, 1, 3, NaN, 0, -3, NaN)
CCO_r = c(-0.4757, NaN, 0.5679, NaN, 0.5582, 0.5899, 0.5857, 0.6256, -0.4646, NaN, -0.6286, -0.4699, NaN, -0.7710, 0.5869, NaN)

d = data.frame(id = 1:length(CCO_lag), CCO_lag, CCO_r)

ggplot(d, aes(x = id, y = "A", size = CCO_r, color = CCO_lag)) +
  geom_point() +
  scale_y_discrete(breaks = NULL) +
  labs(y = "", x = "")
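
One thing to watch with mapping size directly to CCO_r: the strongest negative correlations get the smallest points. The following is only a sketch of one possible variation, assuming you want point size to reflect the strength of the correlation regardless of sign and the lag shown on a diverging colour scale centred on zero. The column name abs_r, the colours, and the choice to drop the NaN channels are all illustrative assumptions, not part of the original data or question.

```r
library(ggplot2)

CCO_lag <- c(0, NaN, -2, NaN, -5, -4, -6, -3, 0, NaN, 1, 3, NaN, 0, -3, NaN)
CCO_r <- c(-0.4757, NaN, 0.5679, NaN, 0.5582, 0.5899, 0.5857, 0.6256,
           -0.4646, NaN, -0.6286, -0.4699, NaN, -0.7710, 0.5869, NaN)

d <- data.frame(id = seq_along(CCO_lag), CCO_lag, CCO_r)

# Assumption: channels with no detected peak (NaN) are simply left blank.
# is.na() is TRUE for NaN as well as NA.
d_ok <- d[!is.na(d$CCO_r), ]

# abs_r is a hypothetical helper column: correlation strength without sign.
d_ok$abs_r <- abs(d_ok$CCO_r)

p <- ggplot(d_ok, aes(x = id, y = "A", size = abs_r, color = CCO_lag)) +
  geom_point() +
  # Point area proportional to |r|, with |r| = 1 as the largest possible point
  scale_size_area(limits = c(0, 1), max_size = 10) +
  # Diverging palette: negative lags blue, zero lag grey, positive lags red
  scale_color_gradient2(low = "blue", mid = "grey80", high = "red", midpoint = 0) +
  # Keep all 16 positions labelled even though some channels are dropped
  scale_x_continuous(breaks = d$id) +
  scale_y_discrete(breaks = NULL) +
  labs(x = "", y = "", size = "|r|", color = "lag")

print(p)
```

With this mapping the sign of the correlation is no longer visible, so if the sign matters you would need to encode it some other way, for example with the point shape.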


If you've got matrices:

lag_mat = matrix(CCO_lag, 4)
r_mat = matrix(CCO_r, 4)
row = c(row(lag_mat))
col = c(col(lag_mat))

dd = data.frame(
  lag = c(lag_mat), r = c(r_mat), row, col
)

ggplot(dd, aes(x = row, y = col, size = r, color = lag)) +
  geom_point() +
  theme(panel.grid = element_blank())


Note that matrices have row 1 on top, with higher-numbered rows below, but plots have the lower y values on the bottom, with higher-numbered y values above. You may want to change that or it may be fine. You can add scale_y_reverse() to your plot to switch it.
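
To make the orientation point concrete, here is a rough sketch of one way to lay the points out the same way the matrix prints, assuming that is what you want: the column index goes on x, the row index goes on y, and scale_y_reverse() puts row 1 at the top. Using abs(r) for size, coord_equal(), and moving the x axis to the top are illustrative choices of mine rather than anything required.

```r
library(ggplot2)

CCO_lag <- c(0, NaN, -2, NaN, -5, -4, -6, -3, 0, NaN, 1, 3, NaN, 0, -3, NaN)
CCO_r <- c(-0.4757, NaN, 0.5679, NaN, 0.5582, 0.5899, 0.5857, 0.6256,
           -0.4646, NaN, -0.6286, -0.4699, NaN, -0.7710, 0.5869, NaN)

# matrix() fills column by column, so element 5 lands in row 1, column 2
lag_mat <- matrix(CCO_lag, nrow = 4)
r_mat <- matrix(CCO_r, nrow = 4)

# One row per matrix cell, with its row and column index
dd <- data.frame(
  row = c(row(r_mat)),
  col = c(col(r_mat)),
  lag = c(lag_mat),
  r = c(r_mat)
)

p <- ggplot(dd, aes(x = col, y = row, size = abs(r), color = lag)) +
  # na.rm = TRUE silently skips the NaN cells instead of warning about them
  geom_point(na.rm = TRUE) +
  # Row 1 at the top, like a printed matrix
  scale_y_reverse(breaks = 1:4) +
  # Column labels along the top edge
  scale_x_continuous(breaks = 1:4, position = "top") +
  coord_equal() +
  theme(panel.grid = element_blank())

print(p)
```

Cells whose correlation is NaN are left empty, which matches the gaps in the printed matrix.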