I'm using Microsoft's C# API for the Cognitive Services (Project Oxford) Bing Speech Recognition service. Specifically, I am using Microsoft.ProjectOxford.SpeechRecognition-x64 version 0.4.10.2.
I send audio to the DataRecognitionClient using the SendAudio and EndAudio methods, and wait for the final set of recognition hypotheses via the OnResponseReceived event. The issue I'm running into is that it's easily possible to have more than one outstanding recognition request, and the SpeechResponseEventArgs object passed to the OnResponseReceived handler doesn't contain any information telling me which request it is a response for.
Here's an example that has actually happened to me many times:
- Person says something, call it utterance A, and I send it via SendAudio and then call EndAudio when they are done talking.
- While still waiting to get the OnResponseReceived event for utterance A, the person says something else, call it utterance B. Again I send it via SendAudio and then call EndAudio when they're done talking. I still haven't gotten an OnResponseReceived event.
- I finally get my first OnResponseReceived event.
- I get a second OnResponseReceived event.
How can I correctly associate the responses with the utterances?
Is there an ordering guarantee such that if I send utterance A and then B, I will always get the response for utterance A first? I haven't seen that guarantee in the documentation.
Since all the requests are async, there is no guarantee that the response for A will always arrive before the response for B. The best approach I would recommend is to create a pool of recognition clients, use a separate client for each recognition, and keep track yourself of which utterance was sent on which client.
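A minimal sketch of the pool idea, to show how a response can be tied back to its utterance. Everything named here (IUtteranceRecognizer, RecognizerPool, FakeRecognizer, FinalResult, UtteranceRecognized) is hypothetical and not part of the SDK. The sketch assumes you write a small adapter that wraps a DataRecognitionClient and raises FinalResult from its OnResponseReceived handler, and that each client produces exactly one final response per utterance.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over one recognition client. Assumption: a real
// adapter would wrap a DataRecognitionClient and raise FinalResult from its
// OnResponseReceived handler.
public interface IUtteranceRecognizer
{
    void SendAudio(byte[] buffer, int count);
    void EndAudio();
    event Action<string> FinalResult; // assumed: raised once per utterance
}

// Hands out one recognizer per utterance and remembers which utterance id
// is in flight on it, so each response can be mapped back to its utterance.
public sealed class RecognizerPool
{
    private readonly Func<IUtteranceRecognizer> _factory;
    private readonly Stack<IUtteranceRecognizer> _idle = new Stack<IUtteranceRecognizer>();
    private readonly Dictionary<IUtteranceRecognizer, Guid> _inFlight =
        new Dictionary<IUtteranceRecognizer, Guid>();
    private readonly object _gate = new object();

    // Raised with the utterance id and the recognized text.
    public event Action<Guid, string> UtteranceRecognized;

    public RecognizerPool(Func<IUtteranceRecognizer> factory)
    {
        _factory = factory;
    }

    // Reserve a recognizer for the given utterance; the caller then uses
    // SendAudio/EndAudio on the returned instance only.
    public IUtteranceRecognizer Begin(Guid utteranceId)
    {
        lock (_gate)
        {
            IUtteranceRecognizer recognizer;
            if (_idle.Count > 0)
            {
                recognizer = _idle.Pop();
            }
            else
            {
                recognizer = _factory();
                var captured = recognizer;
                // Subscribe once, when the recognizer is first created.
                captured.FinalResult += text => OnFinal(captured, text);
            }
            _inFlight[recognizer] = utteranceId;
            return recognizer;
        }
    }

    private void OnFinal(IUtteranceRecognizer recognizer, string text)
    {
        Guid id;
        lock (_gate)
        {
            // Ignore responses from recognizers with no utterance in flight.
            if (!_inFlight.TryGetValue(recognizer, out id)) return;
            _inFlight.Remove(recognizer);
            _idle.Push(recognizer); // make it available for the next utterance
        }
        UtteranceRecognized?.Invoke(id, text);
    }
}

// Stand-in recognizer for trying the pool without the speech service.
public sealed class FakeRecognizer : IUtteranceRecognizer
{
    public event Action<string> FinalResult;
    public void SendAudio(byte[] buffer, int count) { }
    public void EndAudio() { }
    public void Complete(string text) { FinalResult?.Invoke(text); }
}
```

With this in place you would call Begin with a fresh Guid for utterance A and again for utterance B, send each utterance's audio on the recognizer you got back, and handle UtteranceRecognized. The order in which the responses arrive no longer matters, because the id comes from the client that received the response rather than from arrival order.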