I had a scaling problem in a production environment. A small number of records processed fine, but larger volumes took dramatically longer than the increase in volume alone could explain. The classes below are simplified just to illustrate the point.
public class Party {
    public int Id { get; set; }
    public DateTime When { get; set; }
    public ICollection<Attendee> Attendees { get; set; }
}

public class Attendee {
    public int Id { get; set; }
    public ICollection<Gift> Gifts { get; set; }
    public ICollection<Allergy> Allergies { get; set; }
}
public IEnumerable<Party> GetAllPartiesByDate(DateTime date) {
    var parties = Context.Parties
        .Include(p => p.Attendees).ThenInclude(a => a.Gifts)
        .Include(p => p.Attendees).ThenInclude(a => a.Allergies)
        .Where(p => p.When == date)
        .ToList();
    return parties;
}
With 4 matching parties, 7 attendees per party, and 3 gifts and 2 allergies per attendee, that's 172 rows in the DB across 4 tables:

4 + (4 * 7) + (4 * 7 * 3) + (4 * 7 * 2) = 172
EF returns that via a single SQL query with 168 rows, not too bad. Because Gifts and Allergies are both joined onto Attendees in the same result set, each attendee row is repeated once per gift-allergy pair:

4 * 7 * (3 * 2) = 168
But make each of those counts just 10 times larger and you get 142,840 rows in the DB:

40 + (40 * 70) + (40 * 70 * 30) + (40 * 70 * 20) = 142,840
But the result set from the single EF query explodes on each successive one-to-many relationship, and now tries to return 40 * 70 * 30 * 20 = 1,680,000 rows, more than ten times the data actually stored in the database.
In more modern versions of Entity Framework Core (5.0 and later), the solution to a Cartesian explosion is AsSplitQuery, which tells EF to load each included collection with its own SQL query.
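For reference, on EF Core 5.0+ the fix would be a one-line change to the method above:

var parties = Context.Parties
    .Include(p => p.Attendees).ThenInclude(a => a.Gifts)
    .Include(p => p.Attendees).ThenInclude(a => a.Allergies)
    .Where(p => p.When == date)
    .AsSplitQuery() // one SQL query per included collection
    .ToList();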
For unfortunate technical reasons we cannot update Entity Framework Core past version 3.1.
But how can you implement AsSplitQuery in EF Core 3.1?
The solution is to split the query by hand: run one query for the parties and their attendees, then one query per large child collection, keeping tracking enabled so that EF Core's relationship fix-up stitches the loaded entities back onto their parents.
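Here is a minimal sketch of that manual split, reusing the illustrative classes above. It assumes the DbContext exposes Gifts and Allergies DbSets and that Gift and Allergy each carry an AttendeeId foreign key; none of that appears in the original model, so treat those names as placeholders:

public IEnumerable<Party> GetAllPartiesByDate(DateTime date) {
    // Query 1: parties and their attendees only; a single, modest join.
    var parties = Context.Parties
        .Include(p => p.Attendees)
        .Where(p => p.When == date)
        .ToList();

    var attendeeIds = parties
        .SelectMany(p => p.Attendees)
        .Select(a => a.Id)
        .ToList();

    // Query 2: gifts for just those attendees. The query is tracked,
    // so EF's relationship fix-up populates each Attendee.Gifts.
    Context.Gifts
        .Where(g => attendeeIds.Contains(g.AttendeeId))
        .Load();

    // Query 3: the same again for allergies.
    Context.Allergies
        .Where(al => attendeeIds.Contains(al.AttendeeId))
        .Load();

    return parties;
}

With the 10x numbers, the three queries return roughly 2,800 + 84,000 + 56,000 rows instead of 1,680,000, which is exactly what AsSplitQuery automates in later versions.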
Remember that AutoDetectChangesEnabled is not the same mechanism as QueryTrackingBehavior.NoTracking (aka AsNoTracking). Turning off AutoDetectChangesEnabled only stops EF scanning tracked entities for modifications; the entities stay tracked and fix-up still runs. AsNoTracking stops tracking entirely, and in EF Core 3.1 that also disables the relationship fix-up this whole approach depends on.
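So if the split queries materialise a large number of entities, a sketch like this (variable names are mine) suspends change detection without losing fix-up:

var original = Context.ChangeTracker.AutoDetectChangesEnabled;
Context.ChangeTracker.AutoDetectChangesEnabled = false;
try {
    // Still a tracking read, so relationship fix-up runs as each
    // entity materialises; EF just skips re-scanning for edits.
    var parties = GetAllPartiesByDate(date);
    // ... read-only work with the graph ...
}
finally {
    // Restore the original behaviour before any writes happen.
    Context.ChangeTracker.AutoDetectChangesEnabled = original;
}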
In practice you will likely need a mixture of the two strategies, eager Include for the small collections and separate split queries for the large ones, depending on the expected volume of each relationship; see the sketch below.
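For example, if allergies stay small but gifts grow without bound, a mixed version might look like this (same assumed AttendeeId foreign key as above):

var parties = Context.Parties
    .Include(p => p.Attendees).ThenInclude(a => a.Allergies) // small: safe to join
    .Where(p => p.When == date)
    .ToList();

var attendeeIds = parties
    .SelectMany(p => p.Attendees)
    .Select(a => a.Id)
    .ToList();

// Large collection: loaded in its own query so it never
// multiplies with Allergies inside a single result set.
Context.Gifts
    .Where(g => attendeeIds.Contains(g.AttendeeId))
    .Load();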