I'm using OpenEars in my app to recognise some words and sentences. I followed the basic tutorial for offline speech recognition and ported it to Swift. This is the setup procedure:
// Observe OpenEars events (hypothesis and failure callbacks)
self.openEarsEventsObserver = OEEventsObserver()
self.openEarsEventsObserver.delegate = self

// Build the vocabulary (addWords() populates `words`) and generate the
// language model and phonetic dictionary for the English acoustic model
let lmGenerator: OELanguageModelGenerator = OELanguageModelGenerator()
addWords()
let name = "LanguageModelFileStarSaver"
lmGenerator.generateLanguageModelFromArray(words, withFilesNamed: name, forAcousticModelAtPath: OEAcousticModel.pathToModel("AcousticModelEnglish"))

// Paths to the generated files, used when starting to listen
lmPath = lmGenerator.pathToSuccessfullyGeneratedLanguageModelWithRequestedName(name)
dicPath = lmGenerator.pathToSuccessfullyGeneratedDictionaryWithRequestedName(name)
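Listening is then started with the generated files, roughly as in the tutorial (a sketch of my port, assuming the OpenEars 2.x method names):

do {
    // Activate the shared pocketsphinx controller before listening
    try OEPocketsphinxController.sharedInstance().setActive(true)
} catch {
    print("Error activating OEPocketsphinxController: \(error)")
}
// Start continuous listening with the generated language model and dictionary
OEPocketsphinxController.sharedInstance().startListeningWithLanguageModelAtPath(lmPath, dictionaryAtPath: dicPath, acousticModelAtPath: OEAcousticModel.pathToModel("AcousticModelEnglish"), languageModelIsJSGF: false)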
The recognition works well in a quiet room for both single words and whole sentences (I would say it has a 90% hit rate). However, when I tried it in a quiet pub with light background noise, the app had serious difficulties recognising even single words. Is there any way to improve the speech recognition when there is background noise?
If the background noise is more or less uniform (i.e. it has a regular pattern), you can try adapting the acoustic model. Otherwise it is an open problem, sometimes referred to as the cocktail party effect, which can only be partly solved, for example with DNNs.
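For OpenEars/pocketsphinx specifically, the adaptation itself is done offline with the CMU Sphinx adaptation tools on recordings collected in the noisy environment; the adapted model is then bundled with the app and selected instead of the stock one. A smaller tweak that sometimes helps with background noise is raising the voice activity detection threshold so the engine rejects more non-speech audio before decoding. A rough sketch, assuming OpenEars 2.x and a hypothetical adapted model bundle named "AcousticModelEnglishAdapted" (check the OpenEars documentation for the recommended vadThreshold range for your model):

// Less sensitive voice activity detection: higher values make the engine
// treat more of the background noise as silence (commonly tried values
// are roughly between 2.0 and 4.0)
OEPocketsphinxController.sharedInstance().vadThreshold = 3.5

// Point the controller at the adapted acoustic model bundle (hypothetical
// name), reusing the language model and dictionary generated in the question
OEPocketsphinxController.sharedInstance().startListeningWithLanguageModelAtPath(lmPath, dictionaryAtPath: dicPath, acousticModelAtPath: OEAcousticModel.pathToModel("AcousticModelEnglishAdapted"), languageModelIsJSGF: false)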