Gesture recognition using a hidden Markov model


I am currently working on a gesture recognition application in MATLAB (using a webcam), with a Hidden Markov Model as the classification stage. I've completed the pre-processing, which includes extracting the feature vectors, and I've applied Principal Component Analysis (PCA) to these vectors.

Now, to use Kevin Murphy's HMM toolbox, I need my observation sequence to be a sequence of integers ranging from 1 to M (M = number of observation symbols). If I understand correctly, I have to build a codebook and use vector quantization to get my observation sequence.

My questions:

  1. How do I build a codebook?
  2. How do I use this codebook to obtain the observation symbols for my input video?

Note: I've used elliptical Fourier descriptors for shape feature extraction, and for each gesture the PCA values are stored in a matrix of dimension [11x220] (number of frames in the video = 11).

What do I do next? Is there any way to obtain feature vectors other than elliptical Fourier descriptors?

Answered by Robert T. McGibbon

HMMs are a family of probabilistic models for sequential data in which you assume that the data are generated by a discrete-state Markov chain over a latent ("hidden") state space. Generally, the so-called "emissions" come from the same family of distributions in each state, but with different parameters.
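To make the generative assumption concrete, here is a minimal sketch of sampling from an HMM. The two-state model, its probabilities, and the names `sample_hmm` and `gaussian_emission` are all made up for illustration; nothing here comes from a particular toolbox.

```python
import random

def sample_hmm(start_prob, trans_prob, emit_fn, length, rng):
    """Draw a hidden state path and one emission per time step."""
    n_states = len(start_prob)
    states, observations = [], []
    # The first hidden state comes from the initial distribution.
    state = rng.choices(range(n_states), weights=start_prob)[0]
    for _ in range(length):
        states.append(state)
        # The emission depends only on the current hidden state.
        observations.append(emit_fn(state, rng))
        # The next hidden state depends only on the current one (Markov property).
        state = rng.choices(range(n_states), weights=trans_prob[state])[0]
    return states, observations

# Hypothetical 2-state model: same emission family (1-D Gaussian),
# different parameters in each state.
START = [0.6, 0.4]
TRANS = [[0.9, 0.1], [0.2, 0.8]]
MEANS, STDS = [0.0, 5.0], [1.0, 0.5]

def gaussian_emission(state, rng):
    return rng.gauss(MEANS[state], STDS[state])

states, observations = sample_hmm(START, TRANS, gaussian_emission, 11, random.Random(0))
```

In a real application you only see `observations`; the state path stays hidden and the parameters have to be learned from data.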

I'm not particularly familiar with the MATLAB implementation, but it sounds like you're referring to one that uses a multinomial emission distribution, where the observed data is a sequence of symbols from a pre-specified alphabet. The unknown parameters in this model are the transition probabilities between the hidden states and the multinomial weights for each output symbol in each state. This is the appropriate distribution if your features are discrete and mutually exclusive, say "gesture went to the left" vs. "gesture went to the right" or something similar.
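As a rough sketch of how such a discrete-emission model gets used for classification: you would train one HMM per gesture and pick the model that gives a new symbol sequence the highest likelihood. The scaled forward algorithm below computes that likelihood. The function name and the two toy models are hypothetical, and symbols here are 0-based indices (subtract 1 from toolbox-style symbols in 1..M).

```python
import math

def forward_log_likelihood(obs, start_prob, trans_prob, emit_prob):
    """Return log P(obs | model) for a discrete-emission HMM.

    obs is a non-empty list of 0-based symbol indices;
    emit_prob[i][k] is P(symbol k | hidden state i).
    """
    n_states = len(start_prob)
    # alpha[i] = P(first symbol, first state = i)
    alpha = [start_prob[i] * emit_prob[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit.
            alpha = [
                sum(alpha[i] * trans_prob[i][j] for i in range(n_states))
                * emit_prob[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the sequence is impossible under this model
        # Rescale to avoid underflow; the log scales add up to the log-likelihood.
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik

# Two hypothetical gesture models over the alphabet {0: "left", 1: "right"}.
START = [0.5, 0.5]
TRANS = [[0.7, 0.3], [0.3, 0.7]]
LEFT_EMIT = [[0.9, 0.1], [0.8, 0.2]]
RIGHT_EMIT = [[0.1, 0.9], [0.2, 0.8]]

sequence = [0, 0, 1, 0]
scores = {
    "left": forward_log_likelihood(sequence, START, TRANS, LEFT_EMIT),
    "right": forward_log_likelihood(sequence, START, TRANS, RIGHT_EMIT),
}
predicted = max(scores, key=scores.get)
```

The parameters would normally come from training (e.g. Baum-Welch) rather than being written by hand as they are here.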

But if your features are continuous, it might be more appropriate to use a continuous emission distribution instead. For instance, Gaussian HMMs are pretty common. Here your observed data is a sequence of continuous (possibly multivariate) vectors, and the assumption is that in each hidden state the output is drawn i.i.d. from a Gaussian with a mean and (co)variance you hope to learn.
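For illustration, this is roughly what replaces the per-state symbol table in a Gaussian HMM: each state scores a feature vector with its own Gaussian density. The sketch assumes a diagonal covariance to keep it short (a full covariance is equally valid), and the function names and toy parameters are made up.

```python
import math

def diag_gaussian_log_pdf(x, mean, var):
    """Log density of x under a Gaussian with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def emission_log_probs(x, means, variances):
    """One log density per hidden state for a single feature vector."""
    return [diag_gaussian_log_pdf(x, m, v) for m, v in zip(means, variances)]

# Hypothetical 2-state model over 2-D features (e.g. two PCA components).
STATE_MEANS = [[0.0, 0.0], [3.0, 3.0]]
STATE_VARS = [[1.0, 1.0], [0.5, 0.5]]

log_probs = emission_log_probs([2.9, 3.1], STATE_MEANS, STATE_VARS)
```

These per-state densities take the place of the discrete emission probabilities in the usual forward recursion, so no codebook or quantization step is needed.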

If you're not opposed to Python, there is some fairly nice documentation of both Multinomial and Gaussian HMMs on the scikit-learn page: http://scikit-learn.org/stable/modules/hmm.html. (The HMM code has since been moved out of scikit-learn into the separate hmmlearn package.)

From a practical perspective, if you're tied to using a Multinomial HMM on your data, I would suggest building the codebook by first running k-means clustering on your feature vectors, and then using the cluster labels as the observation symbols fed to the HMM. But using a Gaussian HMM would likely be preferable.
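A minimal sketch of the codebook route, in plain Python so it stays self-contained: pool the per-frame feature vectors from all training videos, run k-means to get M centroids (the codebook), then replace each frame of a video by the index of its nearest centroid. The names `build_codebook` and `quantize`, the toy 2-D data, and the choice M = 2 are assumptions for illustration. In practice you would more likely use a library routine such as MATLAB's `kmeans` or scikit-learn's `KMeans`.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(codebook, vector):
    """Index of the codeword closest to vector."""
    return min(range(len(codebook)), key=lambda k: squared_distance(codebook[k], vector))

def build_codebook(vectors, n_symbols, n_iter=50, seed=0):
    """Plain k-means (Lloyd's algorithm); returns n_symbols centroids."""
    rng = random.Random(seed)
    # Initialise the centroids with randomly chosen training vectors.
    codebook = [list(v) for v in rng.sample(vectors, n_symbols)]
    for _ in range(n_iter):
        # Assignment step: group every vector with its nearest centroid.
        clusters = [[] for _ in range(n_symbols)]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            [sum(col) / len(members) for col in zip(*members)] if members else codebook[k]
            for k, members in enumerate(clusters)
        ]
        if updated == codebook:
            break  # converged
        codebook = updated
    return codebook

def quantize(codebook, frames):
    """Map each frame's feature vector to a symbol in 1..M (1-based)."""
    return [nearest(codebook, f) + 1 for f in frames]

# Toy stand-in for pooled training frames (real frames would have 220 values each).
training = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
codebook = build_codebook(training, n_symbols=2)

# One "video" of three frames becomes a sequence of three symbols.
symbols = quantize(codebook, [[0.1, 0.1], [5.0, 5.0], [0.0, 0.0]])
```

With the data from the question, each gesture video would turn into a sequence of 11 symbols. The codebook has to be built once from the training data and then reused unchanged for every new video, otherwise the symbols won't mean the same thing across sequences.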