Why is the error handling for IEnumerator.Current different from IEnumerator<T>.Current?

4.1k views Asked by At

I would have thought that executing the following code for an empty collection that implements IEnumerable<T> would throw an exception:

var enumerator = collection.GetEnumerator();
enumerator.MoveNext();
var type = enumerator.Current.GetType(); // Surely should throw?

Because the collection is empty, then accessing IEnumerator.Current is invalid, and I would have expected an exception. However, no exception is thrown for List<T>.

This is allowed by the documentation for IEnumerator<T>.Current, which states that Current is undefined under any of the following conditions:

  • The enumerator is positioned before the first element in the collection, immediately after the enumerator is created. MoveNext must be called to advance the enumerator to the first element of the collection before reading the value of Current.
  • The last call to MoveNext returned false, which indicates the end of the collection.
  • The enumerator is invalidated due to changes made in the collection, such as adding, modifying, or deleting elements.

(I'm assuming that "fails to throw an exception" can be categorised as "undefined behaviour"...)

However, if you do the same thing but use an IEnumerable instead, you DO get an exception. This behaviour is specified by the documentation for IEnumerator.Current, which states:

  • Current should throw an InvalidOperationException if the last call to MoveNext returned false, which indicates the end of the collection.

My question is: Why this difference? Is there a good technical reason that I'm unaware of?

It means identical-seeming code can behave very differently depending on whether it's using IEnumerable<T> or IEnumerable, as the following program demonstrates (note how the code inside showElementType1() and showElementType1() is identical):

using System;
using System.Collections;
using System.Collections.Generic;

namespace ConsoleApplication2
{
    class Program
    {
        public static void Main()
        {
            var list = new List<int>();

            showElementType1(list); // Does not throw an exception.
            showElementType2(list); // Throws an exception.
        }

        private static void showElementType1(IEnumerable<int> collection)
        {
            var enumerator = collection.GetEnumerator();
            enumerator.MoveNext();
            var type = enumerator.Current.GetType(); // No exception thrown here.
            Console.WriteLine(type);
        }

        private static void showElementType2(IEnumerable collection)
        {
            var enumerator = collection.GetEnumerator();
            enumerator.MoveNext();
            var type = enumerator.Current.GetType(); // InvalidOperationException thrown here.
            Console.WriteLine(type);
        }
    }
}
2

There are 2 answers

18
Patrick Hofman On BEST ANSWER

The problem with IEnumerable<T> is that Current is of type T. Instead of throwing an exception, default(T) is returned (it is set from MoveNextRare).

When using IEnumerable you don't have the type, and you can't return a default value.

The actual problem is you don't check the return value of MoveNext. If it returns false, you shouldn't call Current. The exception is okay. I think they found it more convenient to return default(T) in the IEnumerable<T> case.

Exception handling brings overhead, returning default(T) doesn't (that much). Maybe they just thought there was nothing useful to return from the Current property in the case of IEnumerable (they don't know the type). That problem is 'solved' in IEnumerable<T> when using default(T).

According to this bug report (thanks Jesse for commenting):

For performance reasons the Current property of generated Enumerators is kept extremely simple - it simply returns the value of the generated 'current' backing field.

This could point in the direction of the overhead of exception handling. Or the required extra step to validate the value of current.

They effectively just wave the responsibility to foreach, since that is the main user of the enumerator:

The vast majority of interactions with enumerators are in the form of foreach loops which already guard against accessing current in either of these states so it would be wasteful to burn extra CPU cycles for every iteration to check for these states that almost no one will ever encounter.

0
Jon Hanna On

To better match how people tend to implement it in practice. As is the change of wording from "Current also throws an exception …" in previous versions of the documentation to "Current should throw …" in the current version.

Depending on how the implementation is working, throwing an exception might be quite a bit of work, and yet because of how Current is used in conjunction with MoveNext() then that exceptional state is hardly ever going to come up. This is all the more so when we consider that the vast majority of uses are compiler-generated and don't actually have scope for a bug where Current is called before MoveNext() or after it has returned false to ever happen. With normal use we can expect the case to never come up.

So if you are writing an implementation of IEnumerable or IEnumerable<T> where catching the error condition is tricky, you might well decide not to do so. And if you do make that decision, it probably isn't going to cause you any problems. Yes, you broke the rules, but it probably didn't matter.

And since it won't cause any problems except for someone who uses the interface in a buggy way, documenting it as undefined behaviour moves the burden from the implementer to the caller to not do something the caller shouldn't be doing in the first place.

But all that said, since IEnumerable.Current is still documented as "should throw InvalidOperationException for backwards compatibility and since doing so would match the "undefined" behaviour of IEnumerable<T>.Current, the probably the best way to perfectly fulfil the documented behaviour of the interface is to have IEnumerable<T>.Current throw an InvalidOperationException in such cases, and have IEnumerable.Current just call into that.

In a way this is the opposite to the fact that IEnumerable<T> also inherits from IDisposable. The compiler-generated uses of IEnumerable will check if the implementation also implements IDisposable and call Dispose() if it does, but aside from the slight performance overhead of that test it meant that both implementers and hand-coded callers would sometimes forget about that, and not implement or call Dispose() when they should. Forcing all implementations to have at least an empty Dispose() made life easier for people in the opposite way to just having Current have undefined behaviour when it isn't valid.

If there were no backwards compatibility issues then we would probably have Current documented as undefined in such cases for both interfaces, and both interfaces inheriting from IDisposable. We probably also wouldn't have Reset() which is nothing but a nuisance.