I have a data frame that has both categorical and numeric variables. I want to cluster this data using Gower distance and get the cluster assignments as a vector, as with the kmeans function. How can I achieve that?
Clustering using gower distance in R
Asked by cuneyttyler
There are 2 answers
You can use the kproto() function from the clustMixType package if you don't insist on using Gower distance. The distance measure in kproto() is similar to Gower distance, except that kproto() uses Euclidean distance to measure dissimilarity between numerical variables, whereas Gower distance normalizes each variable (it divides the distance between two observations by the range of that variable). The code is pretty simple:
library(clustMixType)
kproto_clustering <- kproto(df, k)      # k is the number of clusters
clusters <- kproto_clustering$cluster   # vector of cluster assignments, one per row
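To make the normalization mentioned above concrete, here is a minimal base R sketch of the Gower dissimilarity between two rows. The data frame and the helper name gower_pair are hypothetical, made up for illustration. The sketch assumes no missing values, equal variable weights, and that every numeric column has a non-zero range.

```r
# Toy mixed-type data (hypothetical values, for illustration only)
df <- data.frame(
  age    = c(20, 30, 60),
  income = c(1000, 3000, 2000),
  city   = factor(c("A", "B", "A"))
)

# Gower dissimilarity between rows i and j.
# Assumes no NAs and that each numeric column has a non-zero range.
gower_pair <- function(data, i, j) {
  parts <- vapply(names(data), function(v) {
    x <- data[[v]]
    if (is.numeric(x)) {
      abs(x[i] - x[j]) / diff(range(x))  # absolute difference scaled by the range
    } else {
      as.numeric(x[i] != x[j])           # simple matching: 0 if equal, 1 otherwise
    }
  }, numeric(1))
  mean(parts)                            # unweighted average over all variables
}

d12 <- gower_pair(df, 1, 2)  # (10/40 + 2000/2000 + 1) / 3
d13 <- gower_pair(df, 1, 3)  # (40/40 + 1000/2000 + 0) / 3
```

Because each numeric contribution is scaled to lie between 0 and 1, no single variable dominates just because of its units, which is the difference from kproto() described above.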
You can use the vegan package to generate your Gower distance matrix, and then create your clusters using the cluster package.
Then you can feed that matrix into the pam() function. The example below will use the Gower distance to generate 5 clusters.
You can then get your cluster information from
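The code for this answer is not shown here, so the following is only a rough sketch of the same workflow, not the answerer's original code. One deliberate swap: it builds the Gower matrix with daisy() from the cluster package instead of vegan, because daisy() accepts factor columns directly, while vegan's vegdist() expects numeric input. The toy data frame is hypothetical, and it uses k = 2 rather than 5 because it has only six rows.

```r
library(cluster)  # provides daisy() and pam()

# Toy mixed-type data (hypothetical values, for illustration only)
df <- data.frame(
  height = c(150, 152, 155, 180, 182, 185),
  group  = factor(c("a", "a", "a", "b", "b", "b"))
)

# Gower dissimilarities between all pairs of rows
gower_dist <- daisy(df, metric = "gower")

# Partitioning around medoids on the precomputed dissimilarities
pam_fit <- pam(gower_dist, k = 2, diss = TRUE)

# Cluster assignments, one per row, comparable to kmeans(...)$cluster
clusters <- pam_fit$clustering
```

The clustering element is an integer vector with one entry per row of df, so it can be attached back with df$cluster <- clusters.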