I want to make a simple game that compares the pronunciation of a given word, provided as an audio file, with the same word spoken by the player into a microphone. By pronunciation I mean that the "sound" of the player's word should be compared to the sound of the given word.
Ideally, the system would return a percentage indicating how closely the player's pronunciation matches the given word.
I've found questions on Stack Overflow about audio fingerprinting and speech recognition. They seem to indicate that it's a very hard problem. But since I don't need full speech recognition, maybe there is a simpler approach that I missed.
So my questions are: Is this even feasible? If it is, how could I approach the problem? Are there libraries that could support me?
You can't do this in JavaScript, but my answer to this question outlines an approach to solve the problem. You'll likely need to use C++, as the relevant SAPI interfaces aren't really exposed via C#.