Scikit Spectral Clustering fails to classify concentric circles

Here is some code to set up the clustering problem:

import numpy as np
import matplotlib.pyplot as plt

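# Two classes: concentric circles centered at (2.5, 2.5)
# with radii r1 = 2 (outer) and r2 = 1 (inner)
X1 = np.zeros(500*4)  # x-coordinates of all 2000 points
X2 = np.zeros(500*4)  # y-coordinates of all 2000 points

r1 = 2; r2 = 1; a = 2.5; b = 2.5  # radii and center (a, b)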

# Outer circle: radius r1, centered at (a, b), with Gaussian noise
h = np.random.uniform(0, 2*np.pi, 1000)
noise = np.random.normal(0, 0.1, 1000)
X1[:1000] = np.cos(h) * r1 + a + noise
noise = np.random.normal(0, 0.1, 1000)
X2[:1000] = np.sin(h) * r1 + b + noise

# Inner circle: radius r2, same center
h = np.random.uniform(0, 2*np.pi, 1000)
noise = np.random.normal(0, 0.1, 1000)
X1[1000:] = np.cos(h) * r2 + a + noise
noise = np.random.normal(0, 0.1, 1000)
X2[1000:] = np.sin(h) * r2 + b + noise

X = np.array([X1,X2]).T

plt.figure(figsize=(4,4))
plt.scatter(X[:,0],X[:,1])

The plot below suggests two clusters: all points on the inner circle should belong to one cluster, and all points on the outer circle to the other.

[image: scatter plot of the generated points]

Using scikit-learn's SpectralClustering with the RBF kernel, we have this code:

from sklearn.cluster import SpectralClustering
clustering = SpectralClustering(n_clusters=2,assign_labels='kmeans', affinity='rbf',random_state=0).fit(X)
print(clustering.labels_)

plt.figure(figsize=(4,4))
# Select the points of each cluster with boolean masks
X_C1 = X[clustering.labels_ == 1]
X_C2 = X[clustering.labels_ == 0]
plt.scatter(X_C1[:,0],X_C1[:,1],c="blue")
plt.scatter(X_C2[:,0],X_C2[:,1],c="red")
plt.show()

But spectral clustering does not seem to work here: the result is as bad as plain KMeans clustering. What is the problem?

There is 1 answer

Alexander L. Hayes On

The default gamma=1.0 parameter is not high enough for this application.

Try gamma=6.0:

from sklearn.cluster import SpectralClustering

clustering = SpectralClustering(n_clusters=2, gamma=6.0).fit(X)

plt.scatter(X[:, 0], X[:, 1], c=clustering.labels_)
plt.show()

Two concentric circles are plotted on a grid. The inner circle is yellow, and the outer circle is violet. A higher gamma value solved this problem.
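To get a rough feel for why gamma matters here, recall that the RBF affinity between two points a distance d apart is exp(-gamma * d^2). The sketch below is only an illustration, not part of scikit-learn: `rbf_affinity`, `ring_gap` and `noise_scale` are names made up for this example, and it assumes the two rings are separated by about r1 - r2 = 1 while nearby points on the same ring are about one noise standard deviation (0.1) apart.

```python
import numpy as np

def rbf_affinity(distance, gamma):
    """RBF similarity exp(-gamma * d^2) for two points a distance d apart."""
    return np.exp(-gamma * distance ** 2)

ring_gap = 1.0     # assumed typical distance between the rings (r1 - r2)
noise_scale = 0.1  # assumed typical distance between nearby points on one ring

for gamma in (1.0, 6.0):
    across = rbf_affinity(ring_gap, gamma)
    within = rbf_affinity(noise_scale, gamma)
    print(f"gamma={gamma}: across rings {across:.4f}, nearby points {within:.4f}")
```

With gamma=1.0 the affinity across the gap is about 0.37, so the two rings stay strongly connected in the similarity graph. With gamma=6.0 it drops to about 0.0025 while nearby points on the same ring stay above 0.9, which is probably why the higher value separates the circles. The noise makes the real gap vary, so treat these numbers as a rough guide only.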