.net core 6 - Entity Framework - when to use AsEnumerable AsQueryable

3.2k views Asked by At

Background

I've migrated my project from .net core 2.2 to .net core 6 few weeks ago. Over the past weeks I am seeing issues with Entity Framework calls that worked well in the past.

The Issue

The following code used to work well on .net core 2.2:

using (var context = new MyContext())
{
        return await context.Accounts
                    .AsNoTracking()
                    .Where(account => SOME_CONDITIONS)
                    .ToArrayAsync()
                    .ConfigureAwait(false);
}

We have found out that now this code is getting stuck in some kind of a deadlock, function does not return.

Suggested Solution

Adding AsAsyncEnumerable fixes the issues and the function finish executing very fast:

using (var context = new MyContext())
{
        return await context.Accounts
                    .AsNoTracking()
                    .AsAsyncEnumerable()
                    .Where(account => SOME_CONDITIONS)
                    .ToArrayAsync()
                    .ConfigureAwait(false);
}

The Question

I saw that it is recommended to use AsEnumerable/AsAsyncEnumerable/AsQueryable for calls using EF on .net core 6.

When should I use AsEnumerable/AsAsyncEnumerable/AsQueryable?

1

There are 1 answers

2
Steve Py On

The typical error you will see when moving from EF Core 2.x to 3.x+ around queries working then not working has to do with a feature in EF Core called client-side evaluation. What this essentially did was set it so that when EF encountered something it couldn't compose down to SQL (typically in the Where clause, but also could be in the Select) it would execute a query using what it could compose down to SQL at that point, and then re-run the rest against the entities in memory.

With EF Core 2.x this feature was enabled by default. In EF Core 3+ it was disabled by default. The danger of client side evaluation is similar to lazy loading, it is a fail-safe to help ensure your code ultimately executes, but it can hide potentially devastating performance issues, especially in cases with new systems that are developed without a real sense of production data loads. Nice and fast when you only have a few hundred test records, but blows up as production sets span millions of rows that the server is trying to load into memory.

So in your case the typical culprit will be something in your Where clause conditions. This is usually something like calling a method on an entity or using a C# function that EF cannot ultimately translate down to SQL.

Using AsEnumerable or ToList prior to the Where clause will essentially perform the client-side evaluation by loading the query into entities in memory befor attempting the Where clause. Obviously the cost here is you are loading all entities into memory. Alternatively you can opt to enable client-side evaluation and get roughly the same behaviour.

The best solution is to always seek to avoid anything that cannot evaluate down to SQL.

Other solutions to consider:

Move all Where conditions that can translate to SQL into a Where prior to the AsEnumerable. This will help ensure that the minimum amount of data is loaded for client-side evaluation.

For example if I had something like:

return await context.Accounts
    .AsNoTracking()
    .Where(a => a.CreatedAt >= startDate && a.CreatedAt < endDate && someEvaluation(a))
    .ToArrayAsync()
    .ConfigureAwait(false);

That someEvaluation method wouldn't be able to be composed down to SQL so it would require client-side evaluation. Using an AsEnumerable method prior to the Where clause would involve loading all accounts into memory. Instead, I could do this:

return await context.Accounts
    .AsNoTracking()
    .Where(a => a.CreatedAt >= startDate && a.CreatedAt < endDate)
    .AsEnumerableAsync()
    .Where(a => someEvaluation(a))
    .ToArrayAsync()
    .ConfigureAwait(false);

At least in this case the accounts that would be loaded into memory would be limited to the date range.

For complex conditional expressions it might not be that simple. Wherever possible you should strive to keep expressions pure to entity properties and computations that can ultimately boil down to SQL.