Why does a Gaussian Mixture Model make different clusters each run?

Question

411 views Asked by godpleasehelp At 19 April 2022 at 18:48

I'm using Python to cluster a 5D data set, and each run generates a different set of clusters. I'm simply curious as to why this is.

Here's the code:

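    import numpy as np
    import pandas as pd
    from collections import Counter
    from sklearn.mixture import GaussianMixture
    from sklearn.neighbors import NearestCentroid

    df = pd.read_csv('database.csv')
    # Drop the identifier and label columns, leaving only the features
    ratios = df.drop(['patient', 'class'], axis=1)

    # Fit a 7-component mixture and assign each row to a cluster
    gaussian = GaussianMixture(n_components=7).fit(ratios).predict(ratios)

    df['gaussian'] = gaussian

    cluster_counts = Counter(df['gaussian'])
    centroids = NearestCentroid().fit(ratios, gaussian).centroids_
    sum_of_distances = np.zeros((len(centroids), 5))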

Here's a graph showing the sum of the average distances to the centroid for one run:

And here's a graph for another run:

You can see that the bar for the Gaussian mixture varies from one run to another, whereas none of the other clustering algorithms change.

If someone could explain why this happens it would be much appreciated.

There is 1 answer

**Piotr Grzybowski** · Accepted Answer · 2022-04-19T18:50:58+00:00

See the GaussianMixture documentation. You are interested in the random_state parameter: each time you fit the model, the parameters may be initialized differently, so the fit can converge to a different solution.

random_state: int, RandomState instance or None, default=None

Controls the random seed given to the method chosen to initialize the parameters (see init_params). In addition, it controls the generation of random samples from the fitted distribution (see the method sample). Pass an int for reproducible output across multiple function calls.
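As a minimal sketch of what passing an int does, the snippet below fits the same model twice with the same seed and checks that the cluster labels match. The synthetic array `X`, the helper `cluster`, and the seed value 42 are assumptions made for illustration; in the question's code you would instead pass `random_state` to the existing `GaussianMixture(n_components=7)` call.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic 5-D data standing in for the question's `ratios` (an assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))


def cluster(data, seed):
    """Fit a 7-component mixture with a fixed seed and return the labels.

    `cluster` is a hypothetical helper, not part of scikit-learn.
    """
    model = GaussianMixture(n_components=7, random_state=seed)
    return model.fit(data).predict(data)


# Two independent fits with the same integer seed.
labels_a = cluster(X, seed=42)
labels_b = cluster(X, seed=42)

# With a fixed seed the initialization is identical, so the labels agree.
same = np.array_equal(labels_a, labels_b)
print(same)
```

With the default `random_state=None`, the two fits are not guaranteed to start from the same initialization, which is why the clusters can change between runs.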

More about random numbers and seeds in Python: "random.seed(): What does it do?"
