Spectral embedding - spectral clustering

1.7k views Asked by At

I'm trying to perform spectral embedding/clustering using Normalized Cuts. I wrote the following code but I have stuck to a logical bottleneck. What do I have to do after clustering the eigenvectors? I don't know how to form the clusters on my original dataset. (A is my affinity matrix)

D = np.diag(np.sum(A, 0))
D_half_inv = np.diag(1.0 / np.sqrt(np.sum(A, 0)))
M = np.dot(D_half_inv, np.dot((D - A), D_half_inv))
# compute eigenvectors and eigenvalues
(w, v) = np.linalg.eigh(M) 
# renorm eigenvectors to have norm 1
var = len(w)
v1 = np.array(np.zeros((var, var)))
for j in range(var):
    v[:][j] = v[:][j]/np.sqrt(np.sum(A,0))
    v[:][j] = v[:][j]/np.linalg.norm(v1[:][j])
v_trailing = v[:,1:45] #omit the corresponding eigenvector of the smallest eigenvalue     which is 0  and 45 is my embedding dimension
k  = 20 #number of clusters
centroids,idx = kmeans2(v_trailing, k)

After that, i get labels for each eigenvector. But how can i link these labels on my original dataset?

1

There are 1 answers

10
bearrito On BEST ANSWER

The output mapping to the original dataset corresponds to the indices of the labels in your modified set.

So if yi is in Cm then the ith entry of A will be in Am

or to put it another way

Let C1 ..... CM be the set of clusters generated by clustering the eigenvectors the clusters you want are : A1 ..... AM where Ai= { j | yj element of Ci }