EF generates inefficient query

102 views Asked by At

I need to get few counts:

SELECT
detail.detailId,
COUNT(*) as Total,
COUNT(CASE when t.Status = 1 and t.Type = 2 then 1 end) as TotalSub
...

FROM [GeneralDetails] AS [detail]
join Environment t on t.Id = detail.EnvironmentId 
    where
    ([detail].[EnvironmentId] IS NOT NULL)

    group by detail.detailId

So in code I tried this, which seems to be pretty close to original query:

GeneralDetails
    .Where(detail=> detail.EnvironmentId != null)
    .GroupBy(detail=> detail.detailId)
    .Select(detailGroup=> new
        {
        detailGroup.detailId,
        Total = detailGroup.Count(),
        TotalSub = detailGroup.Count(detail=>detail.Environment.Status == Status.Enabled && detail.Environment.Type == Type.General)
        ...
        });

However, it generates this query instead, which is extremely slow:

  SELECT
   [s].[detailId],
    COUNT(*) AS [Total],
    -- with separate subquery for every counter similar to this
    (
    SELECT COUNT(*)
    FROM [GeneralDetails] AS [s0]
    LEFT JOIN [Environment] AS [t] ON [s0].[EnvironmentId] = [t].[Id]
    WHERE ([s0].[EnvironmentId] IS NOT NULL) AND [s].[detailId] = [s0].[detailId] AND [t].[Status] = 1 AND [t].[Type] = 2) AS [ActiveSub]
    ...
    FROM [GeneralDetails] AS [s]
WHERE [s].[EnvironmentId] IS NOT NULL
GROUP BY [s].[detailId]

What am I missing? How to force EF to generate a faster query?

Update Changing the code to

...
TotalSub = detailGroup.Sum(detail=> detail.Environment.Status == Status.Enabled && detail.Environment.Type == Type.General ? 1 : 0)
...

Produces equally slow query:

SELECT [s].[detailId], COUNT(*) AS [Total], (
    SELECT COALESCE(SUM(CASE
        WHEN [t].[Status] = 1 AND [t].[Type] = 2 THEN 1
        ELSE 0
    END), 0)
    FROM [GeneralDetails] AS [s0]
    LEFT JOIN [Environment] AS [t] ON [s0].[EnvironmentId] = [t].[Id]
    WHERE ([s0].[EnvironmentId] IS NOT NULL) AND [s].[detailId] = [s0].[detailId]) AS [TotalSub]
FROM [GeneralDetails] AS [s]
WHERE [s].[EnvironmentId] IS NOT NULL
GROUP BY [s].[detailId]
2

There are 2 answers

0
Steve Py On

One issue that trips up EF is dealing with optional relationships. A notable difference between your expected scenario is the INNER JOIN between GroupDetail and Environment where EF will use an OUTER JOIN between the two because an Environment is optional, even though in this case you are only interested in GroupDetails that have an environment. There has been some back & forth on this about being able to designate this at the Query level but AFAIK nothing definitive has been done about it. One way you can force it that might help performance is to invert the query, but this is only an option if the relationship between GroupDetail and Environments is bi-directional, otherwise it requires an explicit Join:

var details = _context.Environments
    .SelectMany(e => e.GroupDetails)
    .GroupBy(detail => detail.detailId)
    .Select(detailGroup=> new
    {
        DetailId = detailGroup.Key,
        Total = detailGroup.Count(),
        TotalSub = detailGroup.Count(detail=>detail.Environment.Status == Status.Enabled && detail.Environment.Type == Type.General)
        ...
    }).SingleOrDefault();

Without the bi-directional reference:

var details = _context.Environments
    .GroupJoin(_context.GeneralDetails,
         e => e.EnvironmentId,
         gd => gd.EnvironmentId,
         (e, gd) => gd)
    .GroupBy(gd => gd.detailId)
    .Select(detailGroup=> new
    {
        DetailId = detailGroup.Key,
        Total = detailGroup.Count(),
        TotalSub = detailGroup.Count(detail=>detail.Environment.Status == Status.Enabled && detail.Environment.Type == Type.General)
        ...
    }).SingleOrDefault();

These should produce the desired inner joins between the tables.

Declaimer: These examples are written from memory so forgive if they aren't 100% syntactically correct, and they may not work, or work but not get as efficient a query as you like. Worst case you can always build a custom SQL statement for this edge case to get a query that performs the optimized query with the INNER JOIN.

0
Svyatoslav Danyliv On

Yua are facing EF Core's limitation with usage of navigation properties in aggregation functions. As workaround you should project navigation properties in GroupBy operator using proper overload:

GeneralDetails
    .Where(detail => detail.EnvironmentId != null)
    .GroupBy(detail => detail.detailId, detail => new { detail, detail.Environment })
    .Select(detailGroup => new
    {
        detailId = detailGroup.Key,
        Total = detailGroup.Count(),
        TotalSub = detailGroup.Count(x => x.Environment.Status == Status.Enabled && x.Environment.Type == Type.General)
        ...
    });