I need to get few counts:
SELECT
detail.detailId,
COUNT(*) as Total,
COUNT(CASE when t.Status = 1 and t.Type = 2 then 1 end) as TotalSub
...
FROM [GeneralDetails] AS [detail]
join Environment t on t.Id = detail.EnvironmentId
where
([detail].[EnvironmentId] IS NOT NULL)
group by detail.detailId
So in code I tried this, which seems to be pretty close to original query:
GeneralDetails
.Where(detail=> detail.EnvironmentId != null)
.GroupBy(detail=> detail.detailId)
.Select(detailGroup=> new
{
detailGroup.detailId,
Total = detailGroup.Count(),
TotalSub = detailGroup.Count(detail=>detail.Environment.Status == Status.Enabled && detail.Environment.Type == Type.General)
...
});
However, it generates this query instead, which is extremely slow:
SELECT
[s].[detailId],
COUNT(*) AS [Total],
-- with separate subquery for every counter similar to this
(
SELECT COUNT(*)
FROM [GeneralDetails] AS [s0]
LEFT JOIN [Environment] AS [t] ON [s0].[EnvironmentId] = [t].[Id]
WHERE ([s0].[EnvironmentId] IS NOT NULL) AND [s].[detailId] = [s0].[detailId] AND [t].[Status] = 1 AND [t].[Type] = 2) AS [ActiveSub]
...
FROM [GeneralDetails] AS [s]
WHERE [s].[EnvironmentId] IS NOT NULL
GROUP BY [s].[detailId]
What am I missing? How to force EF to generate a faster query?
Update Changing the code to
...
TotalSub = detailGroup.Sum(detail=> detail.Environment.Status == Status.Enabled && detail.Environment.Type == Type.General ? 1 : 0)
...
Produces equally slow query:
SELECT [s].[detailId], COUNT(*) AS [Total], (
SELECT COALESCE(SUM(CASE
WHEN [t].[Status] = 1 AND [t].[Type] = 2 THEN 1
ELSE 0
END), 0)
FROM [GeneralDetails] AS [s0]
LEFT JOIN [Environment] AS [t] ON [s0].[EnvironmentId] = [t].[Id]
WHERE ([s0].[EnvironmentId] IS NOT NULL) AND [s].[detailId] = [s0].[detailId]) AS [TotalSub]
FROM [GeneralDetails] AS [s]
WHERE [s].[EnvironmentId] IS NOT NULL
GROUP BY [s].[detailId]
One issue that trips up EF is dealing with optional relationships. A notable difference between your expected scenario is the
INNER JOIN
between GroupDetail and Environment where EF will use anOUTER JOIN
between the two because an Environment is optional, even though in this case you are only interested in GroupDetails that have an environment. There has been some back & forth on this about being able to designate this at the Query level but AFAIK nothing definitive has been done about it. One way you can force it that might help performance is to invert the query, but this is only an option if the relationship between GroupDetail and Environments is bi-directional, otherwise it requires an explicitJoin
:Without the bi-directional reference:
These should produce the desired inner joins between the tables.
Declaimer: These examples are written from memory so forgive if they aren't 100% syntactically correct, and they may not work, or work but not get as efficient a query as you like. Worst case you can always build a custom SQL statement for this edge case to get a query that performs the optimized query with the
INNER JOIN
.