Python: an iteration over a non-empty list with no if-clause comes up empty. Why?

How can an iterator over a non-empty sequence, with no filtering and no aggregation (sum(), etc.), yield nothing?

Consider a simple example:

sequence = ['a', 'b', 'c']
list((el, ord(el)) for el in sequence)

This yields [('a', 97), ('b', 98), ('c', 99)] as expected.

Now, just swap the ord(el) out for an expression that takes the first value out of some generator using (...).next() — forgive the contrived example:

def odd_integers_up_to_length(str):
    return (x for x in xrange(len(str)) if x%2==1)

list((el, odd_integers_up_to_length(el).next()) for el in sequence)

This yields []. Yeah, empty list. No ('a',stuff) tuples. Nothing.

But we're not filtering or aggregating or reducing. A generator expression over n objects without filtering or aggregation must yield n objects, right? What's going on?

There are 4 answers

Devin Jeanpierre On BEST ANSWER

odd_integers_up_to_length(el).next() raises StopIteration. Nothing catches it at the call site, so it propagates out of the enclosing generator expression, and list() interprets that as the generator being exhausted. The generator expression stops without ever yielding anything.

Look at the first iteration, when the value is 'a':

>>> odd_integers_up_to_length('a').next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Gunnlaugur Briem On

What happens is that the next() call raises a StopIteration exception, which bubbles up the stack to the outer generator expression and stops that iteration.

A StopIteration is the normal way for an iterator to signal that it's done. Generally we don't see it, because generally the next() call occurs within a construct that consumes the iterator, e.g. for x in iterator or sum(iterator). But when we call next() directly, we are the ones responsible for catching the StopIteration. Not doing so springs a leak in the abstraction, which here leads to unexpected behavior in the outer iteration.
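To see why the leaked exception is indistinguishable from normal exhaustion, here is a minimal sketch that replaces the outer generator expression with a hand-written iterator class. The class name Pairs and the empty inner iterator are illustrative assumptions, not part of the question's code. A class is used because Python 3.7 and later turn a StopIteration that leaks out of a generator into a RuntimeError (PEP 479), while a plain iterator like this one behaves the same way on Python 2 and Python 3.

```python
class Pairs(object):
    """Hypothetical stand-in for the outer generator expression."""

    def __init__(self, seq):
        self._it = iter(seq)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration raised here would mean "seq is exhausted".
        el = next(self._it)
        # Stand-in for an inner iterator that has nothing to give.
        inner = iter([])
        # StopIteration raised here leaks out and looks exactly the same
        # to whoever is consuming Pairs.
        return (el, next(inner))

    next = __next__  # Python 2 spelling of the same method


result = list(Pairs(['a', 'b', 'c']))
print(result)  # []
```

list() keeps calling the iterator until it sees StopIteration, and it has no way to tell which of the two next() calls raised it.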

The lesson, I suppose: be careful about direct calls to next().
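One way to follow that advice is to give next() a default, or to catch StopIteration right where next() is called. The sketch below reworks the question's example under a few assumptions: the parameter is renamed to text, range replaces xrange so the code also runs on Python 3, and first_odd_or_none is a hypothetical helper name. The two-argument form of the built-in next() is available from Python 2.6 onward.

```python
def odd_integers_up_to_length(text):
    # Same generator as in the question; parameter renamed so it does not
    # shadow the built-in str, and range used instead of xrange.
    return (x for x in range(len(text)) if x % 2 == 1)


def first_odd_or_none(text):
    # Hypothetical helper: handle StopIteration at the point of the call
    # instead of letting it escape into an enclosing iteration.
    try:
        return next(odd_integers_up_to_length(text))
    except StopIteration:
        return None


sequence = ['a', 'b', 'c']

# next(iterator, default) returns the default instead of raising.
pairs = list((el, next(odd_integers_up_to_length(el), None)) for el in sequence)
print(pairs)  # [('a', None), ('b', None), ('c', None)]
```

Each element here is a one-character string, so the inner generator is always empty and every pair carries the default. With a longer string such as 'abc', first_odd_or_none returns 1.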

Martin On

str is the name of a built-in type. It is not a reserved keyword, but shadowing it is still a bad idea, so you should name your parameter differently.

I was also going to advise about the next() call.

dwc On

>>> seq = ['a', 'b', 'c']
>>> list((el, 4) for el in seq)
[('a', 4), ('b', 4), ('c', 4)]

So it's not list giving you trouble here...