Watson conversation service did not recognize my accent. Therefore I used a custom model, and here are the results before and after using the custom model.
Test Results
Before integrating the model :- When you have a motto that they have in the. Sheila. Jabba among the. The woman. The.
After integrating the model :- We give Omatta David. Sri Lanka. Jabba among the. Number. Gov.
Actual audio- Audio 49, Wijayaba Mawatha, Kalubowila, Dehiwela, Sri Lanka. Government. Gov.
How I included the custom model- I used the same file given in the demo forked from GitHub. In socket.js I included the customization id as shown in the picture. There were other ways of including the custom model (ways to integrate custom model), but I would like to know whether the method I used is correct.
Here is the Python code I used to create the custom model: code link
Here is the corpus result I got after executing the Python code, in JSON format: corpus file
Here is the custom model (the custom model text file that was included in the code), where I have included all the Sri Lankan roads.
I forked the file and edited the socket.js as follows.
First, unless I'm missing something, several of the words you said don't actually appear in the corpus1.txt file. Obviously the service needs to know of words that you expect it to transcribe.
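A quick way to check this yourself is to compare the words you plan to say against the corpus text. The sketch below is only an illustration: the `missing_words` helper and the sample corpus contents are made up here, not taken from your corpus1.txt. Also note that a word missing from the corpus may still be in the service's base vocabulary, so treat the output as a list of candidates to add, not a definitive list of failures.

```python
import re


def missing_words(expected_phrase, corpus_text):
    """Return the words of expected_phrase that never occur in corpus_text.

    Comparison is case-insensitive; digits and punctuation are ignored.
    """
    def tokenize(text):
        return re.findall(r"[a-z']+", text.lower())

    corpus_vocab = set(tokenize(corpus_text))
    missing = []
    for word in tokenize(expected_phrase):
        if word not in corpus_vocab and word not in missing:
            missing.append(word)
    return missing


# Hypothetical corpus contents, standing in for the real corpus1.txt.
corpus_text = "Wijayaba Mawatha\nGalle Road\n"
expected = "49, Wijayaba Mawatha, Kalubowila, Dehiwela, Sri Lanka"

print(missing_words(expected, corpus_text))
```

In practice you would read the real file with `open("corpus1.txt").read()` and pass in the sentence you recorded.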
Next, the service is geared towards more common speech patterns. A list of arbitrary names is difficult because it can't guess a word based on its context. This is normally what the custom corpus provides, but that doesn't work in this case (unless you happen to read the names in the exact order they appear in the corpus - and even then, they only appear once and without any context that the service would already recognize.)
To compensate for this, in addition to the corpus of custom words, you may need to provide a sounds_like for many of them to indicate pronunciation: http://www.ibm.com/watson/developercloud/doc/speech-to-text/custom.shtml#addWords

This is quite a bit more work (it must be done for each word that the service doesn't recognize correctly), but should improve your results.
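As a rough sketch of what that could look like, the snippet below builds a JSON body with a `words` list where each entry carries `word`, `sounds_like` and `display_as`, which is the shape I understand the add-words call to expect (please verify against the documentation linked above). The `build_words_payload` helper is hypothetical, and the pronunciations are my own guesses purely for illustration; you would need to tune them by ear.

```python
import json


def build_words_payload(pronunciations):
    """Build a request body for adding custom words with pronunciations.

    pronunciations maps each word to a list of "sounds like" spellings.
    """
    words = []
    for word, sounds in pronunciations.items():
        words.append({
            "word": word,
            "sounds_like": list(sounds),
            "display_as": word,  # how the word should appear in transcripts
        })
    return {"words": words}


# Assumed pronunciations, for illustration only; adjust to how you say them.
pronunciations = {
    "Kalubowila": ["kalu bo wila"],
    "Dehiwela": ["day he wala", "dehi wela"],
}

payload = build_words_payload(pronunciations)
print(json.dumps(payload, indent=2))
```

The resulting JSON would then be sent to the add-words method of your custom model using whatever HTTP client your existing script already uses, and the model has to be trained again afterwards before the new pronunciations take effect.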
Third, the audio file you provided has a fair amount of background noise which will degrade your results. A better microphone/recording location/etc. will help.
Finally, speaking more clearly, with precise diction and as close to a "standard" US English accent as you can muster, should also help improve the results.