Input for Hidden Markov Model-based speech recognition program


I am going to build a speech recognition program based on Hidden Markov Models. Unfortunately, I don't know how to read an input sound sequence and work with it. Can anyone tell me the general approach for reading sample values from a sound file (e.g. .wav or .mp3) and slicing the audio into pieces in C++?


There is 1 answer

Dmytro Prylipko On BEST ANSWER

The general approach is to convert the input sound into a sequence of feature vectors (usually MFCCs). This process is described in general terms in the CMU Sphinx wiki and in detail in the HTK Book. You might also want to study the general-purpose openSMILE toolkit to see how it is done in C++.