Is there a good tool for detecting SignalR-Core Hub deadlocks and thread hangs?


The Question: Is there a good tool for monitoring performance and running diagnostics on a web-based (IIS-hosted) SignalR Core hub? Specifically, I want to detect possible deadlocks or hung threads. I have looked at a few tools such as LeanSentry, but they seem to ignore WebSockets for the most part.

First, some background: We have a SaaS web application that uses SignalR to push real-time updates to a SPA page (including, but not limited to, changing what is shown on the page during a live stream and/or video meeting). Since there can be anywhere from 10 to 1000+ concurrent attendees during one of these "live events", we use short-term in-memory caching (0.5 to 15 seconds, depending on the data) in the hub, usually a static Dictionary<>, so that the attendees do not all make the same database query at the same time: the first caller runs the query if the cached data is stale, and the rest get the fresh data from the Dictionary cache.

To prevent concurrent read/write errors, we hold a SemaphoreSlim while reading from or writing to any of the Dictionaries, with each Dictionary getting its own semaphore instance.

The problem: Recently one of these semaphores appears to have suffered a "hang", blocking all other threads from entering that code block. I have been unable to reproduce it and I'm at a loss for how to proceed. From the symptoms at the time it looked like a deadlock, but that should not be possible since only a single semaphore was involved. I have not even been able to confirm that the code snippet below is where the hang occurred, only that it was the biggest symptom at the time. Since it happened during a live event, I did not have time for proper debugging, only rapid damage control (I resolved it by manually restarting the IIS application pool).

try
{
    await _vonageSessionsSp.WaitAsync();
    var key = $"{eventId}|{channel}";

    // The cached entry counts as fresh if it was fetched within the last 5 seconds.
    var isFresh = _activeVonageSessions.ContainsKey(key) && _activeVonageSessions[key].Fetched > DateTime.Now.AddSeconds(-5);
    if (!isFresh)
    {
        if (!_activeVonageSessions.ContainsKey(key))
        {
            _activeVonageSessions.Add(key, new ActiveVonageSession());
        }

        var dbSession = await _context.VonageSession.FirstOrDefaultAsync(x => x.EventId == eventId && x.Channel == channel);
        if (dbSession != null) //would have entered here
        {
            _activeVonageSessions[key].Fetched = DateTime.Now;
            _activeVonageSessions[key].SessionId = dbSession.SessionCode;
            _activeVonageSessions[key].ScreenshareLayout = dbSession.ScreenshareLayout;
            _activeVonageSessions[key].ActiveRecordingId = dbSession.ActiveRecordingId;
        }
        else //this would have been done weeks prior during the event setup process
        {
            var session = _OTClient.CreateSession("", MediaMode.ROUTED, ArchiveMode.MANUAL);
            await _context.VonageSession.AddAsync(new Domain.Entities.VonageSession() { EventId = eventId, Channel = channel, SessionCode = session.Id, Inserted = DateTime.Now, ScreenshareLayout = "singleSpeaker", ActiveRecordingId = null });
            await _context.SaveChangesAsync();
            _activeVonageSessions[key].Fetched = DateTime.Now;
            _activeVonageSessions[key].SessionId = session.Id;
            _activeVonageSessions[key].ScreenshareLayout = "singleSpeaker";
            _activeVonageSessions[key].ActiveRecordingId = null;
        }
    }
    return _activeVonageSessions[key];
}
finally
{
    _vonageSessionsSp.Release();
}
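
Since I cannot confirm that this semaphore is where the hang occurred, one thing I could try is a timed acquire, so that a stuck lock shows up as a logged exception rather than a silent hang. The following is only a minimal sketch under assumptions: `TimedLock`, `RunAsync`, and the lock name passed in are hypothetical names that do not exist in our code base, and the timeout value would need tuning. It relies on `SemaphoreSlim.WaitAsync(TimeSpan)`, which returns false on timeout instead of waiting indefinitely.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the hub code above): wraps a SemaphoreSlim
// so that a stuck acquire surfaces as an exception instead of a silent hang.
public static class TimedLock
{
    // Runs `body` while holding `gate`. If the gate cannot be acquired within
    // `timeout`, throws a TimeoutException that names the lock so it can be logged.
    public static async Task<T> RunAsync<T>(
        SemaphoreSlim gate, string name, TimeSpan timeout, Func<Task<T>> body)
    {
        // The acquire sits outside the try block, so Release() only runs
        // after a successful acquire.
        if (!await gate.WaitAsync(timeout))
        {
            throw new TimeoutException(
                $"Could not acquire '{name}' within {timeout} (CurrentCount={gate.CurrentCount}).");
        }

        try
        {
            return await body();
        }
        finally
        {
            gate.Release();
        }
    }
}
```

In the hub, the body of the try block above would become the lambda passed to `TimedLock.RunAsync(_vonageSessionsSp, "vonageSessions", TimeSpan.FromSeconds(10), async () => { ... })`. This would not explain why the holder never released the semaphore, but it should at least confirm which lock is stuck and keep the remaining callers from queuing forever.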