I used the R package corrplot to visualize the correlation matrix of my data. I clustered the variables using the built-in option order="hclust".
The call looked like this (plus various settings for titles, axes, etc.):
corrplot(Rbas,type="upper",order="hclust",method="ellipse")
I am now doing further analysis and visualization with other packages, and the question of compatibility of results has come up. In particular, I have to reproduce the clustering of the correlation matrix manually. The corrplot documentation leaves one point unclear: which dissimilarity measure does corrplot use behind its defaults? Is it 1-|corr|, sqrt(1-corr^2), or something else? The literature offers multiple choices, for example, as described in this article.
Update to answer my own question. As a guess, I tried the dissimilarity measure 1-corr. That is, I ran the following (Rbas is the correlation matrix):
dissim1<-1-Rbas
dist1<-as.dist(dissim1)
plot(hclust(dist1))
and recovered an ordering of the variables that coincides with the one produced by corrplot with order="hclust" and otherwise default settings. But it is not clear whether this is indeed the mechanism corrplot uses, and whether the match will hold for any other matrix?
The function used by corrplot to reorder variables is corrMatOrder (try ?corrMatOrder). It returns a single permutation vector.
When order = "hclust" is selected in corrplot, corrMatOrder invokes the corrplot:::reorder_using_hclust function. This function uses 1-corr as the dissimilarity measure.
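The mechanism described above can be checked on any correlation matrix with a short sketch like the following. It is an illustration, not corrplot's own code: the built-in mtcars data stands in for Rbas, and "complete" linkage is assumed to match what corrplot uses when hclust.method is left unset. The comparison with corrMatOrder only runs if corrplot is installed.

```r
# Correlation matrix of a built-in data set, standing in for Rbas (assumption)
Rbas <- cor(mtcars)

# Manual clustering with 1 - corr as the dissimilarity measure
hc <- hclust(as.dist(1 - Rbas), method = "complete")
manual_order <- hc$order

# Compare with corrplot's ordering when the package is available
if (requireNamespace("corrplot", quietly = TRUE)) {
  pkg_order <- corrplot::corrMatOrder(Rbas, order = "hclust",
                                      hclust.method = "complete")
  # TRUE means both approaches give the same permutation
  print(identical(as.integer(manual_order), as.integer(pkg_order)))
}

# Reordered matrix, ready to pass to other plotting packages
Rbas_ordered <- Rbas[manual_order, manual_order]
```

If a different hclust.method was passed to corrplot, the same value has to be passed to hclust here, since the linkage method affects the resulting order as much as the dissimilarity does.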