.NET Speech recognition of predefined text


I'm developing an application in which the user reads some predefined text aloud and a speech recognition engine transcribes what they said. We then compare the transcript with the predefined text to find which sentence or part of the text they are reading.
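To make the comparison step concrete, here is a minimal sketch of one way it could work, written in Python for brevity rather than in the application's own language. It is not our actual implementation: the function names (`split_sentences`, `normalize`, `best_match`), the naive sentence splitting, and the choice of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration.

```python
import difflib
import re

# Sample predefined text (the passage quoted in this question).
PREDEFINED_TEXT = (
    "One morning, when Gregor Samsa woke from troubled dreams, he found "
    "himself transformed in his bed into a horrible vermin. He lay on his "
    "armour-like back, and if he lifted his head a little he could see his "
    "brown belly, slightly domed and divided by arches into stiff sections."
)


def split_sentences(text):
    """Naively split text into sentences ending in '.', '!' or '?'."""
    parts = re.findall(r"[^.!?]+[.!?]?", text)
    return [part.strip() for part in parts if part.strip()]


def normalize(text):
    """Lowercase and tokenize into words so case and punctuation are ignored."""
    return re.findall(r"\w+", text.lower())


def best_match(transcript, sentences):
    """Return (index, score) of the sentence most similar to the transcript.

    The score is difflib's word-level similarity ratio, between 0 and 1.
    Assumes `sentences` is not empty.
    """
    heard = normalize(transcript)
    scores = [
        difflib.SequenceMatcher(None, heard, normalize(sentence)).ratio()
        for sentence in sentences
    ]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


if __name__ == "__main__":
    sentences = split_sentences(PREDEFINED_TEXT)
    # A hypothetical, slightly wrong transcript of part of the second sentence.
    noisy = "he lay on his armor like back and if he lifted his hat a little"
    index, score = best_match(noisy, sentences)
    print(index, round(score, 2), sentences[index])
```

Matching on word sequences rather than raw characters tolerates a few misrecognized words, but it degrades quickly when most of the transcript is wrong, which is the situation described below with the dictation grammar.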

We were using Nuance NDev as our speech recognition engine, but it now costs too much, so we are looking for an alternative.

So I was experimenting with the .NET speech recognition engine, but I was not able to find a way to achieve this.

From my tests:

  • The dictation grammar is promising because it transcribes every word the user says, but the result is so chaotic that it is almost impossible to find a match.

  • The combination of the GrammarBuilder and Choices classes works more like a command => action mapping. It does not transcribe all the words the user says; it just searches for one particular word or command and returns it.

So what I am wondering is whether there is a way to get a grammar with dictation-like behaviour but restricted to a subset of words, such as all the words in my predefined text, or some other way to supply words or sentences as hints to the recognition engine.

For example, if I give the engine this predefined text:

One morning, when Gregor Samsa woke from troubled dreams, he found himself transformed in his bed into a horrible vermin. He lay on his armour-like back, and if he lifted his head a little he could see his brown belly, slightly domed and divided by arches into stiff sections.

It would only be able to return words from this text, so recognition should be easier and more accurate.

If you have any ideas on how to achieve this, or any other alternative, I'm all ears. The only constraint is that it must support both English and French.

Thanks.


There is 1 answer

Nikolay Shmyrev (best answer)

One option would be to try the pocketsphinx engine from CMUSphinx through C# interop bindings. It allows you to specify a language model compiled from the text, so it will then detect the words accurately.

Models for French and English are available.
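As a rough illustration of what "compiled from the text" involves, the sketch below prepares the usual inputs for a language-model build: a corpus with one normalized sentence per line and a sorted vocabulary list. It is plain Python and does not call pocketsphinx or any CMUSphinx tool. The helper names (`corpus_lines`, `vocabulary`), the naive tokenization, and the assumption that one lowercase sentence per line is an acceptable input for whichever language-model tool you use are all hypothetical, so check that tool's documentation for the exact format it expects.

```python
import re

# Sample predefined text (the passage quoted in the question).
PREDEFINED_TEXT = (
    "One morning, when Gregor Samsa woke from troubled dreams, he found "
    "himself transformed in his bed into a horrible vermin. He lay on his "
    "armour-like back, and if he lifted his head a little he could see his "
    "brown belly, slightly domed and divided by arches into stiff sections."
)


def corpus_lines(text):
    """Return one lowercase, punctuation-free line per sentence.

    Sentences are split naively on '.', '!' and '?'. Words are runs of
    letters/digits, so 'armour-like' becomes 'armour like'.
    """
    lines = []
    for sentence in re.findall(r"[^.!?]+", text):
        words = re.findall(r"\w+", sentence.lower())
        if words:
            lines.append(" ".join(words))
    return lines


def vocabulary(lines):
    """Return the sorted list of distinct words used in the corpus lines."""
    return sorted({word for line in lines for word in line.split()})


if __name__ == "__main__":
    lines = corpus_lines(PREDEFINED_TEXT)
    # Print the corpus, then the vocabulary; in practice these would be
    # written to files and passed to a language-model build tool.
    print("\n".join(lines))
    print("\n".join(vocabulary(lines)))
```

Every word in the vocabulary also needs a pronunciation in the recognizer's dictionary for the language in use, so proper nouns such as "Gregor" and "Samsa" may need to be added by hand.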