Extremely slow groupby and join Linq query

63 views Asked by At

I have a query where I want a left join on 3 tables then groupby and perform some logic. However, the query is super slow. I think the FirstOrDefault() could be the culprit here but cant work out why it is so slow. Can anybody spot where I may be going wrong here, and suggest ways to improve the performance of the query :)

var result =
    from admins in _db.ContactAdmins
    where admins.SupervisorId == assignmentNumber
    from contact in _db.ContactShifts.Where(x => admins.AssignmentNumber == x.AssignmentNumber).DefaultIfEmpty()
    from esr in _db.VEsrRecords.Where(x => admins.AssignmentNumber == x.AssignmentNumber).DefaultIfEmpty()
    group new
    {
        esr,
        contact
    }
    by new
    {
        AssignmentNumber = esr.AssignmentNumber,
        Forename = esr.FirstName,
        Surname = esr.LastName,
        Email = esr.EmailAddress,
        Role = esr.PositionName,
    }
    into dashboard
    select new
    {
        Id = dashboard.Where(x => x.contact.IsActive).Select(x => x.contact.Id).FirstOrDefault(),
        AssignmentNumber = dashboard.Key.AssignmentNumber,
        Forename = dashboard.Key.Forename,
        Surname = dashboard.Key.Surname,
        Email = dashboard.Key.Email,
        Role = dashboard.Key.Role,
        ContactCount = dashboard.Where(x => x.contact.ContactShiftStart.Value.Year == DateTime.Now.Year).Sum(x => x.contact.ContactCount),
        IsComplete = dashboard.OrderByDescending(x => x.contact.Id).Where(x => x.contact.IsComplete == true).Select(x => x.contact.IsComplete).FirstOrDefault(),
        IsScheduled = dashboard.Where(x => x.contact.IsScheduled == true).Select(x => x.contact.IsScheduled).FirstOrDefault(),
        IsStarted = dashboard.Where(x => x.contact.IsStarted == true).Select(x => x.contact.IsStarted).FirstOrDefault(),
        IsActive = dashboard.Where(x => x.contact.IsActive == true).Select(x => x.contact.IsActive).FirstOrDefault(),
        SentForReview = dashboard.Where(x => x.contact.IsActive == true).Select(x => x.contact.HasLeadSentForReview).FirstOrDefault(),
        ClinicianSignOff = dashboard.Where(x => x.contact.IsActive == true).Select(x => x.contact.HasClinicianSignedOff).FirstOrDefault(),
        LatestContact = dashboard.OrderByDescending(x => x.contact.Id).Where(x => x.contact.IsComplete == true).Select(x => x.contact.ContactShiftStart).FirstOrDefault().ToString(),
        ScheduledContact = dashboard.Where(x => x.contact.IsActive).Select(x => x.contact.ContactShiftStart).FirstOrDefault().ToString(),
        Guid = dashboard.Where(x => x.contact.IsActive == true).Select(x => x.contact.GuidId).FirstOrDefault()
    };

return new JsonResult(result);
1

There are 1 answers

0
Svyatoslav Danyliv On

This query should be written without grouping. GroupBy has only one effective usage - calculate aggregates. In your cse, you are trying to retrieve details from the group. If you are using EF Core 7 or higher, you can use GroupJoin.

var thisYearStart = new DateTime(DateTime.Now.Year, 1, 1);
var nextYearStart = thisYearStart.AddYears(1);

var result =
    from admins in _db.ContactAdmins
    where admins.SupervisorId == assignmentNumber
    join esr in _db.VEsrRecords on admins.AssignmentNumber equals esr.AssignmentNumber
    join contact in _db.ContactShifts on admins.AssignmentNumber equals contact.AssignmentNumber into contactGroup
    from activeContact in contactGroup.Where(x => x.IsActive)
        .Take(1)
        .DefaultIfEmpty()
    select new
    {
        Id = activeContact.Id,
        AssignmentNumber = esr.AssignmentNumber,
        Forename = esr.Forename,
        Surname = esr.Surname,
        Email = esr.Email,
        Role = esr.Role,
        ContactCount = contactGroup.Where(x => x.ContactShiftStart.Value >= thisYearStart && x.ContactShiftStart.Value < nextYearStart).Sum(x => x.ContactCount),
        IsComplete = contactGroup.Any(x => x.IsComplete),
        IsScheduled = contactGroup.Any(x => x.IsScheduled),
        IsStarted = contactGroup.Any(x => x.IsStarted),
        IsActive = activeContact != null,
        SentForReview = contactGroup.Any(x => x.IsActive && x.HasLeadSentForReview),
        ClinicianSignOff = contactGroup.Any(x => x.IsActive && x.HasClinicianSignedOff),
        LatestContact = contactGroup.Where(x => x.IsComplete).Max(x => x.ContactShiftStart),
        ScheduledContact = contactGroup.Where(x => x.IsActive).Min(x => x.ContactShiftStart),
        Guid = activeContact.GuidId
    };

return new JsonResult(result);

This is not final and best variant, but you should learn from this query haow to retrieve data effectively. Especially look at activeContact, probably you can reus it for other columns calculation. Also note that instead x.ContactShiftStart.Value.Year filtering i have used date rabge filter, it is more effective if you have indexe on ContactShiftStart column.