Guaranteed way to associate speech recognition result with an utterance?


I'm using Microsoft's C# API for the Cognitive Services (Project Oxford) Bing Speech Recognition service. Specifically, I am using Microsoft.ProjectOxford.SpeechRecognition-x64 version 0.4.10.2.

I send audio to the DataRecognitionClient using the SendAudio and EndAudio methods, and wait for the final set of recognition hypotheses via the OnResponseReceived event. The issue I'm running into is that it's easy to end up with more than one outstanding recognition request, and the SpeechResponseEventArgs object passed to the OnResponseReceived handler doesn't contain any information telling me which request it is a response to.

Here's an example that has actually happened to me many times:

  1. Person says something, call it utterance A, and I send it via SendAudio and then call EndAudio when they are done talking.
  2. While still waiting to get the OnResponseReceived event for utterance A, the person says something else, call it utterance B. Again I send it via SendAudio and then call EndAudio when they're done talking. I still haven't gotten an OnResponseReceived event.
  3. I finally get my first OnResponseReceived event.
  4. I get a second OnResponseReceived event.

How can I correctly associate the responses with the utterances?

Is there an ordering guarantee such that if I send utterance A and then B, I will always get the response for utterance A first? I haven't seen that guarantee in the documentation.


There is 1 answer

Patrick On BEST ANSWER

Since all the requests are async, there is no guarantee that the response for A will always arrive before the response for B. The approach I would recommend is to create a pool of recognition clients, use a separate client for each utterance, and keep track of which client is handling which utterance.
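
To make the pool idea concrete, here is a minimal sketch, not a definitive implementation. IRecognitionClient, RecognizerPool and FakeClient are hypothetical names introduced only for illustration; in a real application a small adapter around DataRecognitionClient would implement the interface and forward its OnResponseReceived event. The sketch also assumes each client raises exactly one final response per SendAudio/EndAudio cycle, and it ignores partial results and errors.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text;
using System.Threading.Tasks;

// Hypothetical stand-in for a single recognition client. An adapter around
// DataRecognitionClient would implement this in a real application.
public interface IRecognitionClient
{
    event Action<string> ResponseReceived; // carries the final hypothesis text
    void SendAudio(byte[] buffer, int count);
    void EndAudio();
}

// Rents one client per utterance, so a response can only belong to the
// utterance that client was rented for.
public sealed class RecognizerPool
{
    private readonly ConcurrentBag<IRecognitionClient> _idle =
        new ConcurrentBag<IRecognitionClient>();
    private readonly Func<IRecognitionClient> _factory;

    public RecognizerPool(Func<IRecognitionClient> factory) { _factory = factory; }

    // Sends one complete utterance; the returned task completes with the
    // response for this utterance, whatever order the responses arrive in.
    public Task<string> RecognizeAsync(byte[] audio)
    {
        if (!_idle.TryTake(out IRecognitionClient client)) client = _factory();

        var tcs = new TaskCompletionSource<string>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        Action<string> handler = null;
        handler = text =>
        {
            client.ResponseReceived -= handler; // one response per rental
            _idle.Add(client);                  // client can be reused now
            tcs.TrySetResult(text);
        };
        client.ResponseReceived += handler;

        client.SendAudio(audio, audio.Length);
        client.EndAudio();
        return tcs.Task;
    }
}

// Test double: "recognizes" ASCII bytes and responds only when told to,
// which makes it possible to simulate out-of-order responses.
public sealed class FakeClient : IRecognitionClient
{
    public event Action<string> ResponseReceived;
    private string _heard = "";

    public void SendAudio(byte[] buffer, int count) =>
        _heard += Encoding.ASCII.GetString(buffer, 0, count);

    public void EndAudio() { }

    public void Complete()
    {
        string text = _heard;
        _heard = "";
        ResponseReceived?.Invoke(text);
    }
}
```

Because each call to RecognizeAsync returns its own task, the caller associates a response with an utterance simply by holding on to the task it got back for that utterance. The trade-off is one client (and presumably one connection) per concurrent utterance, which is why idle clients are returned to the pool rather than discarded.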