Open source tools for recognizing untranscribed speech without a dictionary


Just doing some general research. Are there any open source (or even paid?) tools / programs that do the following:

INPUT: an audio file of unlabeled speech, perhaps a few sentences long, with no phonetic transcription provided

OUTPUT: an audio file with phonetic transcriptions (in the IPA alphabet) aligned and labeled on the audio

Can this be done with just a phonetic dictionary, without a word dictionary?


There is 1 answer

madmik3 On BEST ANSWER

Sphinx has an allphone mode (phoneme recognition) that will produce this kind of output hypothesis. However, most speech recognition improves considerably when a phonetic dictionary and an n-gram language model are used. It's possible to use those to create the hypothesis and then convert it into labeled, aligned phonemes with Sphinx.

Here is an example that covers phoneme-only recognition:

http://cmusphinx.sourceforge.net/wiki/phonemerecognition
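One gap worth noting: the English Sphinx models label phones with CMU/ARPAbet-style symbols, not IPA, so getting the output asked for in the question needs a relabeling step. Below is a minimal sketch of that step only. The `Segment` tuple and the `to_ipa` function are hypothetical names made up for this illustration, the segment list is assumed to have already been parsed from the recognizer's output, and the table is one common ARPAbet-to-IPA convention rather than anything Sphinx itself provides.

```python
from typing import List, NamedTuple

# One common ARPAbet -> IPA convention for the 39 CMU phones (an assumption;
# other conventions differ slightly, e.g. for ER or R).
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "EH": "ɛ", "ER": "ɝ",
    "EY": "eɪ", "F": "f", "G": "ɡ", "HH": "h", "IH": "ɪ", "IY": "i",
    "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n", "NG": "ŋ",
    "OW": "oʊ", "OY": "ɔɪ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ",
    "T": "t", "TH": "θ", "UH": "ʊ", "UW": "u", "V": "v", "W": "w",
    "Y": "j", "Z": "z", "ZH": "ʒ",
}

# Labels treated as silence and dropped (assumed label; adjust to your model).
SILENCE_LABELS = {"SIL"}


class Segment(NamedTuple):
    """Hypothetical container for one aligned phone: label plus times in seconds."""
    label: str
    start: float
    end: float


def to_ipa(segments: List[Segment]) -> List[Segment]:
    """Relabel ARPAbet segments with IPA, keeping the original timings."""
    result = []
    for seg in segments:
        # Normalize case and strip stress digits such as the "1" in "OW1".
        phone = seg.label.upper().rstrip("012")
        if phone in SILENCE_LABELS:
            continue
        # Unknown labels are passed through unchanged so nothing is lost.
        ipa = ARPABET_TO_IPA.get(phone, seg.label)
        result.append(Segment(ipa, seg.start, seg.end))
    return result


if __name__ == "__main__":
    # Made-up example segments, not real recognizer output.
    example = [
        Segment("SIL", 0.00, 0.10),
        Segment("HH", 0.10, 0.18),
        Segment("AH0", 0.18, 0.25),
        Segment("L", 0.25, 0.33),
        Segment("OW1", 0.33, 0.50),
    ]
    for seg in to_ipa(example):
        print(f"{seg.start:.2f}\t{seg.end:.2f}\t{seg.label}")
```

The resulting list of label and time triples could then be written out in whatever annotation format your audio tool reads.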

But I have been out of the speech rec game for a long time. I believe most people are pursuing neural nets now for these kinds of tasks, and I don't know of any open neural nets in that space.