Why does LINQ to Objects implement its iterators manually?


While browsing the .NET Core source, I noticed that the iterator classes are implemented by hand, even in source form, instead of relying on the yield statement and the compiler-generated IEnumerable implementation.

For example, this line has the declaration and implementation of the Where iterator: https://github.com/dotnet/corefx/blob/master/src/System.Linq/src/System/Linq/Enumerable.cs#L168

I assume that if they went to the trouble of doing this instead of using a simple yield statement, there must be some benefit, but I can't immediately see what it is. It looks very similar to what I remember the compiler doing automatically, from reading Eric Lippert's blog a few years back. I also remember that when I naively reimplemented LINQ with yield statements in its early days, to understand it better, the performance profile was similar to that of the .NET version.

This piqued my curiosity, but it is also a practically important question for me: I'm in the middle of a fairly large in-memory data project, and if I'm missing something obvious that makes this approach better, I would love to know the tradeoffs.

Edit: to clarify, I do understand why they can't just yield in the Where method (enumeration differs by container type). What I don't understand is why they implement the iterator itself by hand. That is, instead of forking to different iterator classes, why not fork to different methods that iterate differently based on the type and use yield, getting the auto-generated state machine rather than a manual one (case 1, goto 2, case 2, and so on)?
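To make the alternative concrete, here is a minimal sketch of the yield-based approach described in the edit. The names `YieldWhere`, `WhereYield`, `WhereArray`, `WhereList` and `WhereEnumerable` are hypothetical and chosen only for illustration; this is not how corefx is written.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: fork to different *methods* by container type,
// and let the compiler generate each state machine via yield.
public static class YieldWhere
{
    public static IEnumerable<T> WhereYield<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));

        // This method has no yield, so the checks above run eagerly.
        if (source is T[] array) return WhereArray(array, predicate);
        if (source is List<T> list) return WhereList(list, predicate);
        return WhereEnumerable(source, predicate);
    }

    private static IEnumerable<T> WhereArray<T>(T[] source, Func<T, bool> predicate)
    {
        // Arrays can be walked by index, without an enumerator object.
        for (int i = 0; i < source.Length; i++)
        {
            if (predicate(source[i])) yield return source[i];
        }
    }

    private static IEnumerable<T> WhereList<T>(List<T> source, Func<T, bool> predicate)
    {
        // foreach over List<T> uses its struct enumerator.
        foreach (T item in source)
        {
            if (predicate(item)) yield return item;
        }
    }

    private static IEnumerable<T> WhereEnumerable<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item)) yield return item;
        }
    }
}
```

Each call returns a compiler-generated type that callers only see as `IEnumerable<T>`, so a later operator in the chain cannot inspect the source or the predicate behind it.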


There is 1 answer

Answer by Theodoros Chatzigiannakis (best answer)

One possible reason is that the specialized iterators perform a few optimizations, like combining selectors and predicates and taking advantage of indexed collections.

These are useful because some sources can be iterated more efficiently than the compiler-generated yield state machine would allow. And by creating these custom iterators, they can pass this extra information about the source to subsequent LINQ operations in a single chain (instead of making that information available only to the first operation in the chain). Thus, consecutive Where and Select operations (with nothing else between them) can be executed as a single pass.
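A rough sketch of the predicate-combining idea, under simplifying assumptions: `WhereIterator<T>`, `FusedLinq` and `WhereFused` are hypothetical names, and the real corefx iterators are considerably more involved (they also cover Select and separate array and list variants). The point is only that a named iterator type can expose its source and predicate to the next operator, which a yield-generated type cannot.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for a specialized iterator.
public sealed class WhereIterator<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> _source;
    private readonly Func<T, bool> _predicate;

    public WhereIterator(IEnumerable<T> source, Func<T, bool> predicate)
    {
        _source = source;
        _predicate = predicate;
    }

    // A following Where fuses with this one: same source, combined predicate.
    public WhereIterator<T> Where(Func<T, bool> next)
    {
        Func<T, bool> first = _predicate;
        return new WhereIterator<T>(_source, x => first(x) && next(x));
    }

    public IEnumerator<T> GetEnumerator() => new Enumerator(_source.GetEnumerator(), _predicate);

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // Hand-written enumerator in place of a compiler-generated state machine.
    private sealed class Enumerator : IEnumerator<T>
    {
        private readonly IEnumerator<T> _inner;
        private readonly Func<T, bool> _predicate;

        public Enumerator(IEnumerator<T> inner, Func<T, bool> predicate)
        {
            _inner = inner;
            _predicate = predicate;
        }

        public T Current { get; private set; }

        object IEnumerator.Current => Current;

        public bool MoveNext()
        {
            // Advance the inner enumerator until an item passes the predicate.
            while (_inner.MoveNext())
            {
                T item = _inner.Current;
                if (_predicate(item))
                {
                    Current = item;
                    return true;
                }
            }
            return false;
        }

        public void Reset() => throw new NotSupportedException();

        public void Dispose() => _inner.Dispose();
    }
}

public static class FusedLinq
{
    public static IEnumerable<T> WhereFused<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        // If the source is already one of our iterators, combine instead of wrapping.
        if (source is WhereIterator<T> previous) return previous.Where(predicate);
        return new WhereIterator<T>(source, predicate);
    }
}
```

With this sketch, `items.WhereFused(a).WhereFused(b)` produces a single iterator over `items` that evaluates `a` and then `b` per element, rather than two nested enumerators.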