How to convert the .wav audio files into text and identify the intents using LUIS

I am working with bot technology. In my current project I implemented the Skype calling feature: I record my voice and store the recording in an Azure Storage blob. Now I want to convert the audio files into text and then identify the intents in that text using LUIS.

This is the code I wrote to upload the recorded content to Azure Storage.

    private async Task OnRecordCompleted(RecordOutcomeEvent recordOutcomeEvent)
    {

        if (recordOutcomeEvent.RecordOutcome.Outcome == Outcome.Success)
        {
            var record = await recordOutcomeEvent.RecordedContent;
            string path = HttpContext.Current.Server.MapPath($"~/{recordOutcomeEvent.RecordOutcome.Id}.wav");//Wma,wav,Mp3  ~/
            using (var writer = new FileStream(path, FileMode.Create))
            {
                await record.CopyToAsync(writer);
            }
            try
            {

                var storageConnectionString = ConfigurationManager.AppSettings["RealtimeAnamoly_StorageConnectionString"];

                Debug.WriteLine(storageConnectionString);

                var storageAccount = CloudStorageAccount.Parse(storageConnectionString);

                // We are going to use Blob Storage, so we need a blob client.
                var blobClient = storageAccount.CreateCloudBlobClient();

                // Data in blobs are organized in containers.
                // Here, we create a new, empty container.
                CloudBlobContainer blobContainer = blobClient.GetContainerReference("myfirstcontainer");
                blobContainer.CreateIfNotExists();

                // Retrieve a reference to a blob named after the recording ID.
                CloudBlockBlob blockBlob = blobContainer.GetBlockBlobReference($"{recordOutcomeEvent.RecordOutcome.Id}.wav");

                // We also set the permissions to "Public", so anyone will be able to access the file.
                // By default, containers are created with private permissions only.
                blobContainer.SetPermissions(new BlobContainerPermissions { PublicAccess = BlobContainerPublicAccessType.Blob });

                // Create or overwrite that blob with the contents of the local file.
                using (var fileStream = System.IO.File.OpenRead(path))//@"path\myfile"
                {
                    blockBlob.UploadFromStream(fileStream);
                }

                //UploadAudioFiletoLuis(path);

                recordOutcomeEvent.ResultingWorkflow.Actions = new List<ActionBase>
                {
                    GetSilencePrompt(),
                    GetPromptForText("Successfully Recorded your message! Please wait for Response")

                    //CreateIvrOptions(AthenaIVROptions.ALS,1,true)

                };

            }
            catch (Exception ex)
            {
                // Log the failure instead of swallowing it silently.
                Debug.WriteLine(ex);
            }
        }
        else
        {
            if (silenceTimes > 1)
            {
                recordOutcomeEvent.ResultingWorkflow.Actions = new List<ActionBase>
                {
                    GetPromptForText("Thank you for calling"),
                    new Hangup() { OperationId = Guid.NewGuid().ToString() }
                };
                recordOutcomeEvent.ResultingWorkflow.Links = null;
                silenceTimes = 0;
            }
            else
            {
                silenceTimes++;
                recordOutcomeEvent.ResultingWorkflow.Actions = new List<ActionBase>
                {
                    GetRecordForText("I didn't catch that, would you kindly repeat?")
                };
            }
        }
    }

Can you please tell me how to convert the .wav audio files into text, and then how to identify the intents and get the response from LUIS?

-Pradeep

There is 1 answer

Ezequiel Jadib (best answer)

You should look at the Microsoft Cognitive Services Bing Speech API, as it does what you are looking for: it converts audio to text.

Here is a sample that uses the API. If you send a WAV file to the bot, it responds with what the API understood from the audio.
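To make the two steps concrete, here is a rough sketch of how the pieces could fit together: post the WAV stream to the speech REST endpoint, take the recognized text, and pass it to the LUIS prediction endpoint to read the top-scoring intent. Treat everything specific here as an assumption rather than the sample's actual code: the class and method names are made up for illustration, the endpoint URLs, header name, and JSON field names (`RecognitionStatus`, `DisplayText`, `topScoringIntent`) are what I recall from the Bing Speech and LUIS v2 REST documentation and may have changed, and the region, app ID, and keys are placeholders. It also uses `System.Text.Json`, whereas an older .NET Framework bot would more likely use Json.NET.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper; not part of the Bot Framework or any SDK.
public static class SpeechToIntent
{
    private static readonly HttpClient Http = new HttpClient();

    // Assumed Bing Speech REST endpoint; verify against the current documentation.
    public const string SpeechEndpoint =
        "https://speech.platform.bing.com/speech/recognition/interactive/cognitiveservices/v1?language=en-US&format=simple";

    // Step 1: send the WAV stream and return the recognized text (null if nothing was recognized).
    public static async Task<string> RecognizeAsync(Stream wav, string speechKey)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Post, SpeechEndpoint))
        {
            request.Headers.Add("Ocp-Apim-Subscription-Key", speechKey);
            request.Content = new StreamContent(wav);
            // The codec parameter contains '/', so skip strict header validation.
            request.Content.Headers.TryAddWithoutValidation(
                "Content-Type", "audio/wav; codec=audio/pcm; samplerate=16000");

            using (var response = await Http.SendAsync(request))
            {
                response.EnsureSuccessStatusCode();
                return ParseDisplayText(await response.Content.ReadAsStringAsync());
            }
        }
    }

    // Reads "DisplayText" from the assumed response shape, only when the status is "Success".
    public static string ParseDisplayText(string json)
    {
        using (var doc = JsonDocument.Parse(json))
        {
            var root = doc.RootElement;
            if (root.TryGetProperty("RecognitionStatus", out var status)
                && status.GetString() == "Success"
                && root.TryGetProperty("DisplayText", out var text))
            {
                return text.GetString();
            }
            return null;
        }
    }

    // Step 2: build the assumed LUIS v2 prediction URL for an utterance.
    public static string BuildLuisUri(string region, string appId, string luisKey, string utterance)
    {
        return $"https://{region}.api.cognitive.microsoft.com/luis/v2.0/apps/{appId}"
             + $"?subscription-key={Uri.EscapeDataString(luisKey)}&q={Uri.EscapeDataString(utterance)}";
    }

    // Reads topScoringIntent.intent from the assumed LUIS response shape (null if absent).
    public static string ParseTopIntent(string json)
    {
        using (var doc = JsonDocument.Parse(json))
        {
            if (doc.RootElement.TryGetProperty("topScoringIntent", out var top)
                && top.TryGetProperty("intent", out var intent))
            {
                return intent.GetString();
            }
            return null;
        }
    }

    // Queries LUIS with the recognized text and returns the top intent name.
    public static async Task<string> GetIntentAsync(string region, string appId, string luisKey, string utterance)
    {
        var json = await Http.GetStringAsync(BuildLuisUri(region, appId, luisKey, utterance));
        return ParseTopIntent(json);
    }
}
```

In the question's `OnRecordCompleted`, the commented-out `UploadAudioFiletoLuis(path)` call is roughly where this would go: open the saved file with `File.OpenRead(path)`, pass the stream to `RecognizeAsync`, and, if the result is not null, pass the text to `GetIntentAsync`.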