I'm using sklearn.manifold.TSNE to project onto 2-dimensional space a dataset that I've separately clustered using sklearn.cluster.KMeans. My code is the following:
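import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

# X is the feature matrix, shape (n_samples, 132)
clustering = KMeans(n_clusters=5, random_state=5)
clustering.fit(X)
tsne = TSNE(n_components=2)
result = tsne.fit_transform(X)
sc = plt.scatter(x=result[:, 0], y=result[:, 1],
                 s=10, c=clustering.labels_)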
What puzzles me is that each time I repeat the process, my data seem to get clustered in totally different ways, as you can see below:
I'm not an expert in clustering or dimensionality reduction techniques, so I guess it might be partly due to the stochastic nature of TSNE. Could it also be that I'm using too many features (132) to perform the clustering?
Did you try setting the random_state parameter in TSNE? It should probably fix it.
Functions that use randomness at some point generally take an input parameter that ensures the same inputs produce the same outputs. This argument is generally called random_state or seed.
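As a rough sketch of that suggestion: the snippet below fixes random_state in TSNE and runs the projection twice to compare the results. The synthetic data from make_blobs, the helper name embed, and the seed value 42 are assumptions made for illustration and are not from the question. Note that the KMeans labels in the question are already fixed by random_state=5, so only the 2-D layout should be changing between runs, not the cluster assignments.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE

# Synthetic stand-in for the real data (assumption): 150 samples, 132 features.
X, _ = make_blobs(n_samples=150, n_features=132, centers=5, random_state=0)

# Cluster labels are computed in the original feature space, as in the question.
labels = KMeans(n_clusters=5, n_init=10, random_state=5).fit_predict(X)


def embed(data, seed):
    """Project data to 2-D with t-SNE using a fixed seed (hypothetical helper)."""
    return TSNE(n_components=2, random_state=seed).fit_transform(data)


# Two independent runs with the same seed.
first = embed(X, seed=42)
second = embed(X, seed=42)

# Compare the two embeddings point by point.
print(np.allclose(first, second))
```

Either embedding can then be passed to plt.scatter with c=labels, as in the original code. Fixing the seed only makes the picture repeatable; a different seed can still give a different but equally valid layout.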
Hope this helps.