How to measure performance of Gaussian mixture?

Question

3.1k views Asked by user3104352 At 03 December 2017 at 21:33

I have a data set with 27211 samples and 90 attributes. The data set has no class labels. I want to fit a Gaussian mixture to it, but I don't know how to measure its performance. Can you help me?

import numpy as np
import pandas as pd
from sklearn import mixture

# TRAIN_PATH_NAME is defined elsewhere in my script; j is the index of the training file.
def fit_and_sample(j):
    trainFile = TRAIN_PATH_NAME + "train" + str(j+1) + ".txt"
    trainData = pd.read_csv(trainFile, sep=",", header=None)

    np.random.seed(42)
    # mixture.GMM was removed in scikit-learn 0.20; GaussianMixture replaces it
    g = mixture.GaussianMixture(n_components=60)
    g.fit(trainData.values)
    print("IS_CONVERGED: ", g.converged_)
    # GaussianMixture.sample returns a tuple (samples, component labels)
    sampled, _ = g.sample(trainData.values.shape[0])
    return sampled

There are 2 answers

silgon On 03 December 2017 at 21:41

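You can use several different performance evaluations for unsupervised learning. scikit-learn provides some information here. Some of these evaluations, such as the ones based on mutual information, compare against reference labels, so check which of them apply to unlabeled data. Also, this post could give you some insight.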

**BaluJr.** · Accepted Answer · 2017-12-03T21:47:25+00:00

Since you do not have a ground truth (labels), you cannot give a definitive estimate of the performance and have to rely on a metric of your choice. Assessing the quality of clusters is quite a common problem, so there is a ton of documentation around.

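There are several options to measure performance in this unsupervised case. For GMMs, which are based on actual probabilities, the most common are BIC and AIC (lower is better for both). They are built directly into the scikit-learn GMM class (`GaussianMixture` in current versions, via its `bic` and `aic` methods).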
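As a rough illustration of the BIC/AIC approach, here is a minimal sketch that fits `GaussianMixture` for a few candidate component counts and compares the scores. The synthetic blobs, the candidate list, and the variable names are assumptions made for the example; with the real data you would pass `trainData.values` instead of `X` and probably try a wider range of `n_components`.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the real data (an assumption for this sketch):
# three well-separated 2-D blobs with 200 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(200, 2)) for c in centers])

candidates = [1, 2, 3, 4, 5]  # hypothetical range of component counts
bic_scores = {}
aic_scores = {}
for k in candidates:
    gmm = GaussianMixture(n_components=k, random_state=0).fit(X)
    bic_scores[k] = gmm.bic(X)  # Bayesian information criterion, lower is better
    aic_scores[k] = gmm.aic(X)  # Akaike information criterion, lower is better

# Pick the component count with the lowest BIC.
best_k_bic = min(bic_scores, key=bic_scores.get)
print("BIC per k:", bic_scores)
print("AIC per k:", aic_scores)
print("k with lowest BIC:", best_k_bic)
```

BIC penalizes extra parameters more heavily than AIC, so it tends to prefer smaller models. When the two disagree, it is worth inspecting both candidates rather than trusting either score blindly.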

But there are many more metrics for measuring the quality of clusters in general. They are well described in the scikit-learn documentation. I find the silhouette score fairly intuitive.
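For the silhouette score, a minimal sketch could look like the following. It hard-assigns each sample to its most likely component with `predict` and passes those labels to `sklearn.metrics.silhouette_score`. The synthetic data and the choice of three components are assumptions for the example, not values taken from the question.

```python
import numpy as np
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the real data (an assumption for this sketch).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(200, 2)) for c in centers])

gmm = GaussianMixture(n_components=3, random_state=0).fit(X)

# Hard cluster assignment: the most probable component for each sample.
labels = gmm.predict(X)

# Silhouette ranges from -1 to 1; higher means tighter, better-separated clusters.
score = silhouette_score(X, labels)
print("silhouette score:", score)
```

The silhouette score relies on pairwise distances, so it can get expensive on tens of thousands of samples. `silhouette_score` accepts a `sample_size` argument (together with `random_state`) to compute it on a random subset instead. Keep in mind that it judges the hard assignments by distance and ignores the component covariances, so it complements BIC/AIC rather than replacing them.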
