Why is xrange able to go back to the beginning in Python?


I encountered this code in the question "Most pythonic way of counting matching elements in something iterable":

r = xrange(1, 10)
print sum(1 for v in r if v % 2 == 0) # 4
print sum(1 for v in r if v % 3 == 0) # 3

r is iterated once, and then it is iterated again. I thought that once an iterator has been consumed, it is exhausted and cannot be iterated again.

Generator expressions can be iterated only once:

r = (7 * i for i in xrange(1, 10))
print sum(1 for v in r if v % 2 == 0) # 4
print sum(1 for v in r if v % 3 == 0) # 0

enumerate(mylist) too:

r = enumerate(mylist)

And file objects too:

f = open(myfilename, 'r')

Why does xrange behave differently?


There are 3 answers

Amber (best answer)

Because the xrange object produced by calling xrange() defines an __iter__ method that returns a new iterator (a separate rangeiterator object) each time it's iterated. The xrange object itself is never consumed; only those iterators are.

>>> x = xrange(3)
>>> type(x)
<type 'xrange'>
>>> i = x.__iter__()
>>> type(i)
<type 'rangeiterator'>
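
The same behaviour can be sketched in Python 3, where range plays the role of Python 2's xrange (that substitution is an assumption of this example, not part of the session above). It suggests that each iter() call on the range object returns a fresh, independent iterator, while a generator simply returns itself:

```python
# Python 3 sketch: range stands in for Python 2's xrange.
r = range(3)

# Each call to iter() builds a new iterator with its own position.
first = iter(r)
second = iter(r)
fresh_each_time = first is not second   # two distinct iterator objects

next(first)                              # advance only the first iterator
second_start = next(second)              # the second one still starts at 0

# A generator is its own iterator, so iter() hands back the same object.
g = (i for i in range(3))
generator_is_own_iterator = iter(g) is g

first_pass = list(g)    # consumes the generator
second_pass = list(g)   # nothing is left for a second pass

print(fresh_each_time, second_start, generator_is_own_iterator)
print(first_pass, second_pass)
```

Every for loop or generator expression over r calls iter(r) behind the scenes, which is why the two sum() calls in the question each start from the beginning.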

senderle

Because xrange does not return a generator. It returns an xrange object.

>>> type(xrange(10))
<type 'xrange'>

In addition to repeated iteration, xrange objects support other things that generators don't -- like indexing:

>>> xrange(10)[5]
5

They also have a length:

>>> len(xrange(10))
10

And they can be reversed:

>>> list(reversed(xrange(10)))
[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

In short, xrange objects are sequences rather than iterators (although they don't support everything a list does, such as slicing):

>>> import collections
>>> isinstance(xrange(10), collections.Sequence)
True

They just do it without using up a lot of memory.

Note also that in Python 3, the range object returned by range has all the same properties.
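
For reference, here is a rough Python 3 version of the checks above. It assumes Python 3, where the abstract base class lives in collections.abc rather than directly in collections; the variable names are only illustrative:

```python
# Python 3 sketch of the checks above, using range instead of xrange.
from collections.abc import Sequence

r = range(10)

fifth = r[5]                            # indexing
length = len(r)                         # length
backwards = list(reversed(r))           # reversal
is_sequence = isinstance(r, Sequence)   # registered as a sequence

# Repeated iteration also works, as in the question.
evens = sum(1 for v in r if v % 2 == 0)
threes = sum(1 for v in r if v % 3 == 0)

print(fifth, length, is_sequence, evens, threes)
print(backwards)
```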

Ben

If all you know about something is that it's iterable, then in general you must assume you can only iterate over it once. That doesn't mean that every iterable can only be consumed once, just that every iterable can be consumed at least once. The obvious example is lists and other sequences, which support this interface and can be iterated repeatedly.

As senderle and Amber have explained, the objects you get by calling xrange happen to be implemented such that you can iterate over them multiple times.

The general iteration protocol allows an object to be exhausted after a single pass. This is because many iterables (such as generators and file objects) would be difficult to implement, would consume much more memory, or would run much slower if they had to support arbitrarily many traversals, and very often this functionality wouldn't even be used. So if the protocol required arbitrarily many traversals, these things probably wouldn't be iterable at all.
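
To illustrate the trade-off: repeated traversal is cheap when the values can be recomputed from a few stored parameters, which is roughly the position xrange is in. The Python 3 sketch below uses two hypothetical names, Multiples and multiples_once, and is not how xrange itself is implemented; it only contrasts an object that builds a new generator per traversal with a plain single-use generator:

```python
class Multiples:
    """Hypothetical re-iterable: stores only the recipe, not the values."""

    def __init__(self, factor, count):
        self.factor = factor
        self.count = count

    def __iter__(self):
        # A generator method: each call returns a brand-new generator,
        # so every loop over the object starts from the beginning.
        for i in range(1, self.count + 1):
            yield self.factor * i


def multiples_once(factor, count):
    """Plain generator function: the returned object is single-use."""
    for i in range(1, count + 1):
        yield factor * i


reusable = Multiples(7, 9)
reusable_counts = (
    sum(1 for v in reusable if v % 2 == 0),
    sum(1 for v in reusable if v % 3 == 0),
)

one_shot = multiples_once(7, 9)
one_shot_counts = (
    sum(1 for v in one_shot if v % 2 == 0),
    sum(1 for v in one_shot if v % 3 == 0),
)

print(reusable_counts, one_shot_counts)
```

A file or a network stream has no such compact recipe, so supporting a second pass would mean buffering or re-reading the data.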

Long story short, if you're writing code that operates on an arbitrary unknown iterable, assume it can only be traversed once; it doesn't matter if someone gives you an object that supports more than the functionality you need. If you know something more about the object (such as that it's also a sequence, or even that it's specifically an xrange object), then you can write code that makes use of that if you want.
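
One way to follow that advice, sketched in Python 3 with hypothetical helper names (count_matches, is_one_shot, is_even, is_multiple_of_3), is to do all the counting in a single pass so the code works whether or not the input can be traversed again:

```python
def count_matches(iterable, *predicates):
    """Count, for each predicate, how many items satisfy it, in one pass.

    Works for one-shot iterators because the input is traversed only once.
    """
    counts = [0] * len(predicates)
    for item in iterable:
        for k, predicate in enumerate(predicates):
            if predicate(item):
                counts[k] += 1
    return counts


def is_one_shot(obj):
    """Heuristic: an iterator returns itself from iter(); a sequence does not."""
    return iter(obj) is obj


def is_even(v):
    return v % 2 == 0


def is_multiple_of_3(v):
    return v % 3 == 0


gen = (7 * i for i in range(1, 10))
gen_is_one_shot = is_one_shot(gen)
range_is_one_shot = is_one_shot(range(1, 10))

from_generator = count_matches(gen, is_even, is_multiple_of_3)
from_range = count_matches(range(1, 10), is_even, is_multiple_of_3)

print(gen_is_one_shot, range_is_one_shot, from_generator, from_range)
```

If several passes are genuinely needed, copying the items into a list first is the usual fallback, at the cost of holding them all in memory.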