I am new to Python. I am removing multicollinear features by computing the correlation matrix between features, following the same approach as https://scikit-learn.org/stable/auto_examples/inspection/plot_permutation_importance_multicollinear.html#permutation-importance-with-multicollinear-or-correlated-features. The specific code is below:
import numpy as np
from scipy.stats import spearmanr
from scipy.spatial.distance import squareform
from collections import defaultdict
from scipy.cluster.hierarchy import linkage, fcluster

corr = spearmanr(data_all[feature_name]).correlation
# Ensure the correlation matrix is symmetric
corr = (corr + corr.T) / 2
np.fill_diagonal(corr, 1)

# We convert the correlation matrix to a distance matrix before performing
# hierarchical clustering using Ward's linkage.
distance_matrix = 1 - np.abs(corr)
y = squareform(distance_matrix)  # condensed (1-D) distance vector
dist_linkage = linkage(y, method='ward')
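To see the heights numerically rather than only on the plot, I also printed the third column of the linkage matrix, which I understand holds the distance at which each merge occurs:

# Each row of dist_linkage is (cluster_i, cluster_j, merge_height, size).
print("max pairwise distance:", y.max())
print("max merge height:", dist_linkage[:, 2].max())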
I got the result below:
The height of the dendrogram is greater than 1, even though all values in y are at most 1. Is this result reasonable? If so, how is the height defined in hierarchy.linkage?
I double-checked that the maximum value of y is indeed smaller than 1. My understanding was that for any linkage method, whether single (minimum), complete (maximum), average, or squared error ("ward"), the merge heights should not exceed the largest distance in y. Moreover, the dendrogram in the scikit-learn example linked above also shows heights greater than 1. I am quite confused by this result.
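To check whether this depends on the linkage method, here is a small self-contained sketch with random data (the matrix X below is a hypothetical stand-in for my data_all[feature_name]):

import numpy as np
from scipy.stats import spearmanr
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 15))  # hypothetical stand-in for my feature matrix
corr = spearmanr(X).correlation
corr = (corr + corr.T) / 2
np.fill_diagonal(corr, 1)
y = squareform(1 - np.abs(corr))  # condensed distances, all values below 1

for method in ["single", "complete", "average", "ward"]:
    heights = linkage(y, method=method)[:, 2]
    print(method, "max merge height:", heights.max(), "max y:", y.max())

Since single, complete, and average merge heights are by construction bounded by the largest pairwise distance, I would expect only "ward" to exceed max(y) in this sketch, which suggests the behaviour is specific to how Ward's method defines the merge height.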