Perform Multidimensional Scaling (MDS) for clustered categorical data in Python


I am currently working on clustering categorical attributes that come from a bank marketing dataset from Kaggle. I have created the three clusters with kmodes:

Output: cluster_df

Now I want to visualize each row of a cluster as a projection or point so that I get some kind of image:

Desired visualization

I am having a hard time with this. Euclidean distance is not meaningful for categorical data, right? If so, is there no way to create the desired visualization?

egjlmn1 (best answer)

A common way to visualize clusters is PCA (principal component analysis). It reduces multi-dimensional data to 2 dimensions so that you can plot it and, hopefully, understand it better. To use it, see the following code:

import pandas as pd
from sklearn.decomposition import PCA

# x must be numeric (e.g. one-hot encoded categorical columns)
pca = PCA(n_components=2)
principalComponents = pca.fit_transform(x)
principalDf = pd.DataFrame(data=principalComponents,
                           columns=['principal component 1', 'principal component 2'])

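where x is your clustered data after it has been fitted and transformed into numeric form. PCA needs numeric input, so categorical columns have to be encoded first (one-hot encoding, for example). Now you can easily visualize your clustered data, since it's 2-dimensional.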
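
To make the encoding step concrete, here is a minimal sketch of the whole pipeline. It assumes cluster_df holds the categorical attributes plus a 'cluster' column with the kmodes labels; the small DataFrame below, its column names and its values are made up for illustration and are not taken from the Kaggle dataset. One-hot encoding via pd.get_dummies is just one possible way to get numeric input for PCA.

```python
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical stand-in for cluster_df: categorical columns plus a 'cluster' label.
cluster_df = pd.DataFrame({
    "job": ["admin", "technician", "admin", "services", "technician", "services"],
    "marital": ["married", "single", "married", "single", "divorced", "single"],
    "housing": ["yes", "no", "yes", "no", "no", "yes"],
    "cluster": [0, 1, 0, 2, 1, 2],
})

# One-hot encode the categorical attributes so PCA receives numeric input.
features = cluster_df.drop(columns="cluster")
x = pd.get_dummies(features).astype(float)

# Project the encoded rows onto the first two principal components.
pca = PCA(n_components=2)
principalComponents = pca.fit_transform(x)
principalDf = pd.DataFrame(
    data=principalComponents,
    columns=["principal component 1", "principal component 2"],
    index=cluster_df.index,
)

# Attach the cluster labels so each point can be colored by its cluster.
principalDf["cluster"] = cluster_df["cluster"]
print(principalDf)
```

From here, something like principalDf.plot.scatter(x="principal component 1", y="principal component 2", c="cluster", colormap="viridis") should draw one point per row, colored by cluster (this needs matplotlib installed). Rows with identical categories land on the same point, so a little jitter may help readability.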