Confused about X in GaussianHMM.fit([X])

Question

1.9k views Asked by Brooks At 11 June 2015 at 19:37

With this code:

X = numpy.array(range(0,5))
model = GaussianHMM(n_components=3,covariance_type='full', n_iter=1000)
model.fit([X])

I get

tuple index out of range 
self.n_features = obs[0].shape[1]

So what exactly are you supposed to pass to .fit()? The hidden states AND the emissions in a tuple? If so, in what order? The documentation is less than helpful.

I noticed it likes being passed tuples as this does not give an error:

X = numpy.column_stack([range(0,5),range(0,5)])
model = GaussianHMM(n_components=3,covariance_type='full', n_iter=1000)
model.fit([X])

Edit:

Let me clarify a bit: the documentation indicates that the input must have this shape:

List of array-like observation sequences (shape (n_i, n_features)).

This would almost indicate that you pass a tuple for each sample that indicates, in a binary fashion, which observations are present. However, their example indicates otherwise:

# pack diff and volume for training
X = np.column_stack([diff, volume])

hence the confusion

There are 2 answers

**Brooks** · Answer 1 · 2015-06-12T12:47:17+00:00

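It would appear that GaussianHMM expects every observation sequence to be a 2-D array of shape (n_i, n_features), even when there is only one feature. A 1-D array such as numpy.array(range(0,5)) has shape (5,), so obs[0].shape[1] does not exist, hence the "tuple index out of range" error; the column_stack version has shape (5, 2) and passes. When the documentation refers to 'n_features', it does not mean the number of distinct values an emission can take but the number of emission dimensions observed at each time step.

Hence, "features" (the emission dimensions, i.e. the columns of X) are not to be confused with "symbols", which, in sklearn's parlance (likely shared with the greater HMM community, for all I know), are the unique discrete values the system is capable of emitting.

For problems where each emission is a single discrete symbol, use MultinomialHMM. A single continuous series can still go to GaussianHMM if it is reshaped into one column, i.e. shape (n, 1).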

Hope that clarifies things for anyone else who wants to use this stuff without becoming the world's foremost authority on HMMs :)
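
For what it's worth, the error in the question can be reproduced with NumPy alone, since it comes from indexing the shape tuple of a 1-D array. The sketch below shows that lookup failing and then gives the series a second dimension, which is the (n_i, n_features) layout quoted from the docstring in the question. The helper name as_observation_matrix is made up for illustration; it is not part of sklearn or hmmlearn.

```python
import numpy as np


def as_observation_matrix(series):
    """Hypothetical helper: coerce input to a 2-D (n_samples, n_features) array."""
    arr = np.asarray(series, dtype=float)
    if arr.ndim == 1:
        # One feature per time step: make it a single column.
        arr = arr.reshape(-1, 1)
    if arr.ndim != 2:
        raise ValueError("expected 1-D or 2-D input, got %d-D" % arr.ndim)
    return arr


X_flat = np.array(range(0, 5))
print(X_flat.shape)  # (5,)

try:
    X_flat.shape[1]  # same lookup as obs[0].shape[1] in the traceback
except IndexError as exc:
    print(exc)  # tuple index out of range

X_col = as_observation_matrix(X_flat)
print(X_col.shape)  # (5, 1)
```

An array shaped like X_col has one row per time step and one column per feature, so the shape[1] lookup that failed in the question has something to read.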

**dixon1e** · Answer 2 · 2015-11-12T05:41:23+00:00

I realize this is an old thread but the problem in the example code is still there. I believe the example is now at this link and still giving the same error:

tuple index out of range 
self.n_features = obs[0].shape[1]

The offending line of code is: model = GaussianHMM(n_components=5, covariance_type="diag", n_iter=1000).fit(X)

Which should be: model = GaussianHMM(n_components=5, covariance_type="diag", n_iter=1000).fit([X])
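
Which of the two calls is right seems to depend on the installed version. As far as I know, later hmmlearn releases take one 2-D array with all sequences concatenated plus a lengths argument, instead of a list of arrays, so it is worth checking help(GaussianHMM.fit) before copying either form. A minimal sketch of preparing data for that newer layout, using only NumPy (pack_sequences is a hypothetical helper, not a library function):

```python
import numpy as np


def pack_sequences(sequences):
    """Hypothetical helper: stack sequences into one 2-D array plus their lengths."""
    arrays = []
    for seq in sequences:
        arr = np.asarray(seq, dtype=float)
        if arr.ndim == 1:
            # Single-feature sequence: turn it into a column.
            arr = arr.reshape(-1, 1)
        arrays.append(arr)
    X = np.concatenate(arrays, axis=0)
    lengths = [len(arr) for arr in arrays]
    return X, lengths


X, lengths = pack_sequences([range(0, 5), range(0, 3)])
print(X.shape, lengths)  # (8, 1) [5, 3]

# Older API discussed in this thread (list of 2-D arrays), not executed here:
#     model.fit(list_of_2d_arrays)
# Newer hmmlearn-style call (an assumption about your installed version):
#     model.fit(X, lengths)
```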
