Clustering using Gower distance in R


I have a data frame that contains both categorical and numeric variables. I want to cluster this data using Gower distance and get the cluster assignments as a vector, as the kmeans function returns. How can I achieve that?


There are 2 answers

Answer by KaanKaant (best answer)

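You can use the vegan package to generate your Gower dissimilarity matrix, and then create your clusters using the cluster package. Note that vegdist() expects numeric input, so factor columns have to be encoded numerically first; daisy() from the cluster package with metric = "gower" accepts mixed-type data frames directly.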

library(vegan)
gow.mat <- vegdist(dataframe, method = "gower")

Then you can feed that matrix into the pam() function. The example below uses the Gower distances to generate 5 clusters; diss = TRUE tells pam() that x is already a dissimilarity matrix rather than raw data.

library(cluster)
clusters <- pam(x = gow.mat, k = 5, diss = TRUE)

The cluster assignments are then available as an integer vector, one label per row:

clusters$clustering
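
Because the data frame in the question mixes categorical and numeric columns, a variant of the same workflow can be sketched with daisy() from the cluster package, which computes Gower dissimilarities on factor columns without manual encoding. This is only a minimal sketch: the toy data frame `df`, its column names, and the choice of k = 2 are assumptions made up for illustration.

```r
library(cluster)  # provides daisy() and pam()

# Hypothetical toy data: two numeric columns and one factor column
df <- data.frame(
  age    = c(23, 25, 31, 58, 61, 64),
  income = c(30, 32, 35, 80, 85, 90),
  group  = factor(c("a", "a", "a", "b", "b", "b"))
)

# Gower dissimilarity for mixed-type data
gow.mat <- daisy(df, metric = "gower")

# Partitioning around medoids on the precomputed dissimilarities
fit <- pam(gow.mat, k = 2, diss = TRUE)

# Integer vector of cluster labels, one per row, comparable to kmeans()$cluster
cluster_vec <- fit$clustering
```

The resulting `cluster_vec` can be attached to the data with `df$cluster <- cluster_vec` if the labels are needed alongside the original rows.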
Answer by Mehmet Yildirim

If you do not insist on Gower distance, you can use the kproto() function from the clustMixType package. The distance measure in kproto() is similar to Gower distance, with one difference: kproto() uses Euclidean distance to measure dissimilarity between numeric variables, whereas Gower distance normalizes each numeric variable by dividing the difference between two observations by the range of that variable. The code is simple:

library(clustMixType)
kproto_clustering <- kproto(df, k)   # k is the number of clusters
clusters <- kproto_clustering$cluster
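
To make the difference in scaling concrete, the sketch below compares the per-variable contribution under Gower-style range normalization with a plain squared difference of the kind a Euclidean-type measure would use. It relies on base R only, the values and column names are invented, and it illustrates just the numeric part of each measure, not the full kproto() or Gower computation.

```r
# Hypothetical numeric variables on very different scales
x <- data.frame(
  height_m   = c(1.60, 1.75, 1.90),
  salary_usd = c(30000, 45000, 90000)
)

# Gower-style contribution for rows 1 and 2: |x_i - x_j| / range, per variable
ranges <- sapply(x, function(v) diff(range(v)))
gower_parts <- abs(unlist(x[1, ]) - unlist(x[2, ])) / ranges

# Unscaled squared differences for the same pair of rows
euclid_parts <- (unlist(x[1, ]) - unlist(x[2, ]))^2

gower_parts   # each contribution lies in [0, 1]
euclid_parts  # the salary term dominates because of its larger scale
```

Under the unscaled measure the variable with the largest units drives the clustering, which is one reason numeric columns are often standardized before a Euclidean-based method is applied.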