Linq2SQL group by a property that is known at runtime

60 views Asked by At

I am trying to return some report data based on a user input frequency (daily, monthly, yearly). My LINQ is the following:

var dataQuery = DataAccess.AppEventRepository.AllNoTracking;

// get property func based on provided filter
Func<AppEvent, DateTime?> timeGroupProp = null;
if (filters.TimeSpanId == (int)TimeSpanEnum.Daily) timeGroupProp = e => e.InsertDay;
if (filters.TimeSpanId == (int) TimeSpanEnum.Monthly) timeGroupProp = e => e.InsertMonth;
if (filters.TimeSpanId == (int) TimeSpanEnum.Yearly) timeGroupProp = e => e.InsertYear;

 var groupedDataQuery = dataQuery
    // downgrading to LINQ2Object, because Invoke is not supported in LINQ2SQL
    .ToList()
    .GroupBy(e => new {InsertGroupProp = timeGroupProp?.Invoke(e), e.CountryId})
    .Select(grp => new AuditReportGroupingDataModel
    {
        GroupTime = grp.Key.InsertGroupProp.Value,
        CountryId = grp.Key.CountryId.Value,
        Count = grp.Count()
     });

This works correctly, but the problem is that grouping is done after all data is retrieved from the SQL. The number of events may grow to hundred of thousands in the future and I expect performance degradation.

Question: is it possible to write my query so that I am grouping on the server-side level? (full use of LINQ2SQL, not downgrading to LINQ2Object)

1

There are 1 answers

0
Alexei - check Codidact On

I managed to find two ways of doing this, but not something so small as Invoke trial.

0) Some POCOs I am using to store the data

public class AuditReportGroupingDataModelBase
{
    public DateTime GroupTime { get; set; }
    public int CountryId { get; set; }
}

public class AuditReportGroupingDataModel : AuditReportGroupingDataModelBase
{
    public int Count { get; set; }
}

1) The ugly way - using conditional operator in GroupBy

My small number of possibilities allow ternary operator usage. However, this does not work properly for an increased number of options.

var groupedDataQuery = dataQuery
    .GroupBy(e => new AuditReportGroupingDataModelBase
    {
        GroupTime = (filters.TimeSpanId == (int)TimeSpanEnum.Daily ? e.InsertDay : filters.TimeSpanId == (int)TimeSpanEnum.Monthly ? e.InsertMonth : e.InsertDay).Value,
        CountryId = e.CountryId.Value
    })
    .Select(grp => new AuditReportGroupingDataModel
    {
        GroupTime = grp.Key.GroupTime,
        CountryId = grp.Key.CountryId,
        Count = grp.Count()
    });

This works, but generates an ugly and not so efficient SQL statement:

exec sp_executesql N'SELECT 
    1 AS [C1], 
    [GroupBy1].[K2] AS [C2], 
    [GroupBy1].[K1] AS [CountryId], 
    [GroupBy1].[A1] AS [C3]
    FROM ( SELECT 
        [Filter1].[K1] AS [K1], 
        [Filter1].[K2] AS [K2], 
        COUNT([Filter1].[A1]) AS [A1]
        FROM ( SELECT 
            [Extent1].[CountryId] AS [K1], 
            CASE WHEN (1 = @p__linq__0) THEN [Extent1].[InsertDay] WHEN (2 = @p__linq__1) THEN [Extent1].[InsertMonth] ELSE [Extent1].[InsertDay] END AS [K2], 
            1 AS [A1]
            FROM [dbo].[AppEvent] AS [Extent1]
            WHERE ([Extent1].[EventTypeId] IN (1)) AND ([Extent1].[CountryId] IS NOT NULL)
        )  AS [Filter1]
        GROUP BY [K1], [K2]
    )  AS [GroupBy1]',N'@p__linq__0 int,@p__linq__1 int',@p__linq__0=1,@p__linq__1=1

2) Better way - GroupBy expression based on value

        IQueryable<IGrouping<AuditReportGroupingDataModelBase, AppEvent>> groupedDataQueryInterm = null;
        if (filters.TimeSpanId == (int)TimeSpanEnum.Daily) groupedDataQueryInterm = dataQuery.GroupBy(e => new AuditReportGroupingDataModelBase { GroupTime = e.InsertDay.Value, CountryId = e.CountryId.Value });
        if (filters.TimeSpanId == (int)TimeSpanEnum.Monthly) groupedDataQueryInterm = dataQuery.GroupBy(e => new AuditReportGroupingDataModelBase { GroupTime = e.InsertMonth.Value, CountryId = e.CountryId.Value });
        if (filters.TimeSpanId == (int)TimeSpanEnum.Yearly) groupedDataQueryInterm = dataQuery.GroupBy(e => new AuditReportGroupingDataModelBase { GroupTime = e.InsertYear.Value, CountryId = e.CountryId.Value });
        if (groupedDataQueryInterm == null)
            throw new InvalidEnumArgumentException($@"Invalid value provided to {nameof(filters.TimeSpanId)}");

        var groupedDataQuery = groupedDataQueryInterm
            .Select(grp => new AuditReportGroupingDataModel
            {
                GroupTime = grp.Key.GroupTime,
                CountryId = grp.Key.CountryId,
                Count = grp.Count()
            })

This generates better SQL:

SELECT 
    1 AS [C1], 
    [GroupBy1].[K2] AS [InsertDay], 
    [GroupBy1].[K1] AS [CountryId], 
    [GroupBy1].[A1] AS [C2]
    FROM ( SELECT 
        [Extent1].[CountryId] AS [K1], 
        [Extent1].[InsertDay] AS [K2], 
        COUNT(1) AS [A1]
        FROM [dbo].[AppEvent] AS [Extent1]
        WHERE ([Extent1].[EventTypeId] IN (1)) AND ([Extent1].[CountryId] IS NOT NULL)
        GROUP BY [Extent1].[CountryId], [Extent1].[InsertDay]
    )  AS [GroupBy1]