I am using the Unity SDK provided for IBM Watson services. I am trying the 'ExampleStreaming.cs' sample provided for speech-to-text recognition, and I test the app in the Unity editor.
This sample uses the microphone as audio input and returns transcription results for the user's speech. However, when I use the microphone as input, the transcribed results are far from correct. When I say "Create a black box", the returned words are completely unrelated to what I said.
When I use pre-recorded voice clips, the output is perfect. Does the service perform poorly for an Indian accent? What is the reason for the poor recognition of microphone input?
The docs say: "In general, the service is sensitive to background noise. For instance, engine noise, working devices, street noise, and talking can significantly reduce accuracy. In addition, the microphones that are typically installed on mobile devices and tablets are often inadequate. The service performs best when professional microphones are used to capture audio with better quality."
I use a Logitech headset mic as the input source.
Satish,
Try to "clean up" the audio as best you can by limiting background noise. Also be aware that you can use one of two different processing models: one for broadband audio and one for narrowband audio. Try them both, and see which is most appropriate for your input device.
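As a rough illustration of the broadband/narrowband choice, here is a minimal Python sketch that picks a model name from the sampling rate of the captured audio. It assumes the service's usual naming pattern (for example en-US_BroadbandModel and en-US_NarrowbandModel) and the documented minimum rates of 16 kHz for broadband and 8 kHz for narrowband. The helper name pick_model is hypothetical and not part of any SDK.

```python
def pick_model(sample_rate_hz: int, language: str = "en-US") -> str:
    """Suggest a Speech to Text model name for a given sampling rate.

    Hypothetical helper for illustration; it only builds a model name
    string and does not call the service.
    """
    if sample_rate_hz >= 16000:
        # Broadband models expect audio sampled at 16 kHz or higher.
        return f"{language}_BroadbandModel"
    if sample_rate_hz >= 8000:
        # Narrowband models expect audio sampled at 8 kHz or higher.
        return f"{language}_NarrowbandModel"
    raise ValueError(f"Sampling rate {sample_rate_hz} Hz is too low for either model")


# Example: a headset microphone recording at 22.05 kHz, and telephone-quality audio.
headset_model = pick_model(22050)
telephone_model = pick_model(8000)
```

If the microphone is opened at a rate that does not match the model the sample requests, that mismatch is one thing worth checking before blaming the accent.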
In addition, you may find that the underlying speech model does not handle all of the domain-specific terms you need. In these cases you can customize and expand the speech model, as explained in the documentation on Using Custom Language Models (https://console.bluemix.net/docs/services/speech-to-text/custom.html#custom). While this is a bit more involved, it can often make a huge difference in accuracy and overall usability.
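A custom language model is typically trained from a plain-text corpus of representative phrases, one per line. The Python sketch below only prepares such a corpus as a string; uploading it and training the model are covered in the linked documentation and are not shown here. The helper name build_corpus and every phrase other than "Create a black box" are made up for illustration.

```python
def build_corpus(phrases):
    """Build plain-text corpus content: one cleaned phrase per line.

    Hypothetical helper for illustration. Collapses repeated whitespace,
    skips empty entries, and drops case-insensitive duplicates.
    """
    seen = set()
    lines = []
    for phrase in phrases:
        cleaned = " ".join(phrase.split())
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            lines.append(cleaned)
    return "\n".join(lines) + "\n"


# Sample commands an app like this might expect (illustrative only).
corpus_text = build_corpus([
    "Create a black box",
    "Create a red sphere",
    "create a black box",   # duplicate, ignored
    "Move the box to the left",
])
```

The closer the corpus phrases are to what users will actually say, the more the customization is likely to help.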