Different results after repeating TSNE after KMeans clustering

I'm using sklearn.manifold.TSNE to project a dataset onto 2-dimensional space. I've clustered the same dataset separately with sklearn.cluster.KMeans. My code is the following:

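from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Cluster in the original feature space (X holds my dataset).
clustering = KMeans(n_clusters=5, random_state=5)
clustering.fit(X)

# Project to 2-D for plotting only; no random_state is set here.
tsne = TSNE(n_components=2)
result = tsne.fit_transform(X)

sc = plt.scatter(x=result[:, 0], y=result[:, 1],
                 s=10, c=clustering.labels_)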

What puzzles me is that each time I repeat the process, my data seem to be clustered in a totally different way, as you can see below:

[Images: scatter plots of the t-SNE projection from three repeated runs]

I'm not an expert in clustering or dimensionality reduction techniques, so I guess that it might be partly due to the stochastic nature of TSNE. Might it also be that I'm using too many features (132) to perform the clustering?

There is 1 answer

Yasser Sami On

Have you tried setting the random_state parameter in TSNE? That should probably fix it.

Functions that rely on randomness generally accept a parameter that ensures the same inputs produce the same outputs. This parameter is usually called random_state or seed.
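As a rough sketch of what that looks like here: KMeans in the question is already seeded, so the cluster labels themselves should not change between runs. Only the unseeded TSNE layout moves the points around. The snippet below uses synthetic data from make_blobs as a stand-in for X (an assumption, since the real dataset isn't shown), and the sample count and n_init value are picked purely for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE

# Synthetic stand-in for X (assumption: only the 132-feature width
# comes from the question; the data itself is made up).
X, _ = make_blobs(n_samples=200, n_features=132, centers=5, random_state=0)

# KMeans is seeded, as in the question, so its labels repeat exactly.
labels_a = KMeans(n_clusters=5, random_state=5, n_init=10).fit(X).labels_
labels_b = KMeans(n_clusters=5, random_state=5, n_init=10).fit(X).labels_

# Seeding TSNE as well makes the 2-D layout repeat across runs.
result_a = TSNE(n_components=2, random_state=5).fit_transform(X)
result_b = TSNE(n_components=2, random_state=5).fit_transform(X)

labels_match = bool(np.array_equal(labels_a, labels_b))
layouts_match = bool(np.allclose(result_a, result_b))
print("same labels:", labels_match)
print("same layout:", layouts_match)
```

If the labels match but the plots still look different without a seed on TSNE, the clustering is stable and only the projection is changing, so the colors mark the same groups of points in every picture.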

Hope this helps.