For the code below:
` library(cluster)
silhouette_scores <- sapply(2:10, function(k) {
  km <- kmeans(iris[, 1:4], centers = k)
  silhouette(km$cluster, dist(iris[, 1:4]))
})
plot(2:10, silhouette_scores, type = "b", xlab = "Number of Clusters",
     ylab = "Silhouette Score", main = "Silhouette Plot")`
I am getting the following error:
` Error in xy.coords(x, y, xlabel, ylabel, log): 'x' and 'y' lengths differ
Traceback:
1. plot(2:10, silhouette_scores, type = "b", xlab = "Number of Clusters",
. ylab = "Silhouette Score", main = "Silhouette Plot")
2. plot(2:10, silhouette_scores, type = "b", xlab = "Number of Clusters",
. ylab = "Silhouette Score", main = "Silhouette Plot")
3. plot.default(2:10, silhouette_scores, type = "b", xlab = "Number of Clusters",
. ylab = "Silhouette Score", main = "Silhouette Plot")
4. xy.coords(x, y, xlabel, ylabel, log)
5. stop("'x' and 'y' lengths differ")`
I am using the Iris dataset in R. I want to calculate silhouette scores for k-means clustering and plot them to decide how many clusters to use. The clustering is meant to run on the first two principal components of the data. The PCA has already been done, but the plotting code now fails with the error above. How can I plot the silhouette score for each k from 2 through 10?
I tried to fix it by adding `plot(x, y[1:length(x)])` before the last line of code, but that didn't work. Does anyone know of another fix?
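
One likely cause, sketched here rather than confirmed: `silhouette()` returns a matrix with one row per observation (columns `cluster`, `neighbor`, `sil_width`), not a single number. `sapply()` therefore collects many more than nine values, and `plot()` cannot pair them with `2:10`. A minimal sketch of one possible fix is to average the `sil_width` column for each k. The names `X`, `d`, `ks`, and `avg_sil`, the seed, and `nstart = 10` are illustrative choices that are not in the original code.

```r
library(cluster)

set.seed(42)            # assumption: fixed seed, since kmeans() starts from random centers
X  <- iris[, 1:4]       # same columns as the posted code
d  <- dist(X)           # distance matrix, computed once and reused for every k
ks <- 2:10

# One average silhouette width per k, so the result has the same length as ks
avg_sil <- sapply(ks, function(k) {
  km  <- kmeans(X, centers = k, nstart = 10)   # nstart = 10 is an illustrative choice
  sil <- silhouette(km$cluster, d)             # n-by-3 matrix, one row per observation
  mean(sil[, "sil_width"])                     # collapse to a single score
})

plot(ks, avg_sil, type = "b", xlab = "Number of Clusters",
     ylab = "Average Silhouette Width", main = "Silhouette Plot")
```

To cluster on the first two principal components instead, `X` could be replaced with something like `prcomp(iris[, 1:4])$x[, 1:2]`, assuming the PCA was done with `prcomp()`; the rest of the sketch stays the same.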