Memory Leak with Microsoft Speech Recognition


I'm seeing what appears to be a memory leak when running speech recognition with Microsoft.Speech.Recognition. The recognition engine's memory usage grows constantly during execution, even though the engine is disposed of correctly. The problem shows up in the following code block:

while (!stoppingToken.IsCancellationRequested && !disposed)
{
    Thread.Sleep(100);
    try
    {
        int read = originStream.Read(buffer, 0, 48000);
        bufferedByteStream.Write(buffer, 0, read);
    }
    catch (IOException ex) when (ex.InnerException is SocketException { SocketErrorCode: SocketError.ConnectionReset })
    {
        Console.WriteLine("Connection was forcibly closed by the remote host.");
        bufferedByteStream.Close();
        disposed = true;
    }
}

Context:

  • The recognition engine grows in memory while it runs in a separate thread.
  • Disposing of the engine at the end of the execution does not resolve the issue.
  • The memory growth is evident even with correct disposal practices.

Additional Information:

  • The issue happens with Microsoft.Speech.Recognition and is also observed with System.Speech.Recognition.
  • The code is part of a background service and runs continuously.

Code Snippet for Reference:

// ... (previous code)

using (SpeechRecognitionEngine engine = SREBuilder.Create(new[] { "1FriendlyDoge", "notanoob600m", "RoyalCrests" }))
{
    engine.SpeechRecognized += (sender, e) => HandleSpeechRecognized(sender, e, userId, guildId);
    engine.SetInputToAudioStream(bufferedByteStream, new SpeechAudioFormatInfo(EncodingFormat.Pcm, 48000, 16, 2, 192000, 4, null));
    engine.RecognizeAsync(RecognizeMode.Multiple);

    while (!stoppingToken.IsCancellationRequested && !disposed)
    {
        Thread.Sleep(100);
        try
        {
            int read = originStream.Read(buffer, 0, 48000);
            bufferedByteStream.Write(buffer, 0, read);
        }
        catch (IOException ex) when (ex.InnerException is SocketException { SocketErrorCode: SocketError.ConnectionReset })
        {
            Console.WriteLine("Connection was forcibly closed by the remote host.");
            bufferedByteStream.Close();
            disposed = true;
        }
    }
}

For additional context, the buffered stream is just a list of bytes with a predefined size (48,000 bytes in this case, which is 0.25 s of audio data), and its size never changes.
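To make that description concrete, here is a minimal sketch of what such a fixed-size buffered stream could look like. It is not the actual implementation used in the service: the class name FixedCapacityStream, the overwrite-oldest policy, and the non-blocking Read are all assumptions for illustration.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the buffered stream described above.
// The backing array is allocated once and never resized.
public sealed class FixedCapacityStream : Stream
{
    private readonly byte[] ring;              // fixed-size storage
    private readonly object gate = new object();
    private int head;                          // index of the oldest unread byte
    private int count;                         // number of unread bytes

    public FixedCapacityStream(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        ring = new byte[capacity];
    }

    public int Capacity => ring.Length;
    public int Available { get { lock (gate) return count; } }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    // Appends bytes; when the buffer is full, the oldest byte is overwritten.
    public override void Write(byte[] buffer, int offset, int length)
    {
        lock (gate)
        {
            for (int i = 0; i < length; i++)
            {
                int tail = (head + count) % ring.Length;
                ring[tail] = buffer[offset + i];
                if (count == ring.Length)
                    head = (head + 1) % ring.Length; // full: drop the oldest byte
                else
                    count++;
            }
        }
    }

    // Copies out whatever is available without blocking. Note that a return
    // value of 0 means "end of stream" under the Stream contract, so a real
    // audio source would probably need to block here instead.
    public override int Read(byte[] buffer, int offset, int length)
    {
        lock (gate)
        {
            int n = Math.Min(length, count);
            for (int i = 0; i < n; i++)
            {
                buffer[offset + i] = ring[head];
                head = (head + 1) % ring.Length;
            }
            count -= n;
            return n;
        }
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}
```

With a design like this, the stream itself holds one 48,000-byte array for its whole lifetime, so it should not be the source of newly allocated Byte[] instances.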

When profiling with dotMemory, the issue seems to be a large number of Byte[] objects whose stack trace is literally just [AllThreadsRoot].

The exception is thrown when a socket connection closes (each socket connection has its own instance of SpeechRecognitionEngine), and it only happens once per instance. After that, everything is disposed.

If I just comment out engine.RecognizeAsync(RecognizeMode.Multiple); the leak doesn't happen. I have some images that may be helpful: dotMemory GC Image, dotMemory GC Image2.

I have tried forcing a more aggressive GC:

GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();

This was to no avail. I'm out of ideas right now, so help is very much appreciated!
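To put numbers on the growth independently of the profiler, a small probe like the following could be run around each recognition session. This is only a sketch: HeapProbe and its method names are made up for illustration, and the session delegate stands in for whatever starts and disposes one engine instance.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for sampling memory after repeated sessions.
public static class HeapProbe
{
    // Runs the same forced collection sequence as above, then returns
    // the managed heap size in bytes.
    public static long SettledManagedBytes()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return GC.GetTotalMemory(forceFullCollection: true);
    }

    // Returns the private memory of the current process in bytes,
    // which also includes memory outside the managed heap.
    public static long PrivateBytes()
    {
        using (Process process = Process.GetCurrentProcess())
        {
            return process.PrivateMemorySize64;
        }
    }

    // Runs `session` the given number of times and records the settled
    // managed heap size after each run.
    public static long[] Measure(Action session, int iterations)
    {
        var samples = new long[iterations];
        for (int i = 0; i < iterations; i++)
        {
            session();
            samples[i] = SettledManagedBytes();
        }
        return samples;
    }
}
```

Comparing the samples from Measure with PrivateBytes() over several sessions should show whether the growth is in the managed heap (consistent with the Byte[] instances seen in dotMemory) or outside it.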
