xrange as an iterator and chunking


The snippets

 xi = xrange(10)
 zip(xi,xi)

and

 xi = iter(range(10))
 zip(xi,xi)

behave differently. I expected to get

  [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]

in the first snippet as well, but it returns

[(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (8, 8), (9, 9)]

instead. It seems the implicit container is being silently copied. Could anyone explain what is going on here, and the reasoning behind choosing such semantics?

>>> sys.version
'2.7.9 (default, Dec 10 2014, 12:28:03) [MSC v.1500 64 bit (AMD64)]'

There are 3 answers

0
user2357112 (best answer)

An xrange is not any sort of iterator. People keep calling it a generator, but it's not; an xrange is an immutable sequence, like a tuple:

>>> x = xrange(5)
>>> x[2]
2
>>> for i in x:
...     print i
...
0
1
2
3
4
>>> for i in x:
...     print i
...
0
1
2
3
4

As with any other sequence type, each time you request an iterator from an xrange, you get a new, independent iterator. Thus, when you zip xrange(10) with itself, you get the same output as if you had zipped [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] with itself, rather than if you had zipped iter([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) with itself.
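A minimal sketch of that distinction. It uses `range` in place of `xrange` so it also runs on Python 3 (an assumption on my part; the variable names are only illustrative), and wraps `zip` in `list()` for the same reason:

```python
# range stands in for xrange here (assumption: on Python 3, range is the
# lazy sequence type that xrange was on Python 2).
seq = range(10)

# A sequence hands out a new, independent iterator on every iter() call.
a = iter(seq)
b = iter(seq)
fresh_each_time = a is not b

# An iterator returns itself from iter(), so its position is shared.
it = iter(seq)
same_object = iter(it) is it

# zip calls iter() on each argument: two independent iterators for the
# sequence, but the very same iterator twice in the second call.
pairs_from_sequence = list(zip(seq, seq))
pairs_from_iterator = list(zip(it, it))

print(pairs_from_sequence[:3])
print(pairs_from_iterator)
```

The first `zip` pairs each value with itself; the second pairs consecutive values, because both arguments advance the same underlying iterator.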

2
Mazdak

I think you have a misunderstanding about the xrange type. According to Python's documentation on the xrange type:

The xrange type is an immutable sequence which is commonly used for looping. The advantage of the xrange type is that an xrange object will always take the same amount of memory, no matter the size of the range it represents. There are no consistent performance advantages.

And the following important part:

XRange objects have very little behavior: they only support indexing, iteration, and the len() function.

The xrange object is not an iterator; it is just an opaque sequence type that yields the same values as the corresponding list without actually storing them all simultaneously. An iterator, by contrast, does not let you reach an element without traversing the items before it, and it does not support operations such as indexing or the len() function. Once you have traversed an iterator, you cannot go back.

So in the second snippet, zip receives the same iterator twice. Each request for an item consumes it, and the iterator can only move on to the next item, so consecutive values end up paired.
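The difference can be checked directly. This is a rough sketch rather than anything from the documentation: `supports` is a hypothetical helper introduced only for this illustration, and `range` stands in for `xrange` so the code also runs on Python 3.

```python
seq = range(5)   # stands in for xrange(5) (assumption, see above)
it = iter(seq)


def supports(operation, obj):
    """Hypothetical helper: True if operation(obj) does not raise TypeError."""
    try:
        operation(obj)
        return True
    except TypeError:
        return False


# The sequence supports len() and indexing; its iterator supports neither.
sequence_has_len = supports(len, seq)
iterator_has_len = supports(len, it)
sequence_indexable = supports(lambda o: o[2], seq)
iterator_indexable = supports(lambda o: o[2], it)

# An iterator is single-pass: a second traversal finds nothing left.
first_pass = list(it)
second_pass = list(it)

print(sequence_has_len, iterator_has_len)
print(sequence_indexable, iterator_indexable)
print(first_pass, second_pass)
```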

0
Bas Swinckels

The xrange object is not itself an iterator, but an iterable object. You get an iterator by passing it to the iter function. The zip function implicitly calls iter on each of its arguments, so it produces two independent, parallel iterators over the xrange object. In the second example, you call iter once by hand and pass that same iterator twice, so you are comparing apples to oranges. To get the effect you want with xrange as well, you should do

In [5]: it = iter(xrange(10))
   ...: zip(it, it)
Out[5]: [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
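The same trick generalizes to chunks of any size by repeating one iterator `n` times. A minimal sketch, assuming a hypothetical helper named `chunks` (not part of the standard library) and using `range` so it runs on Python 3 as well:

```python
def chunks(iterable, n):
    """Group items into tuples of length n.

    Hypothetical helper for illustration. Leftover items that do not fill
    a whole tuple are dropped, because zip stops at the shortest input.
    """
    it = iter(iterable)            # one shared iterator
    return list(zip(*[it] * n))    # zip(it, it, ..., it) with n references


print(chunks(range(10), 2))
print(chunks(range(10), 3))
```

Note that with a chunk size of 3, the final value 9 is silently discarded, so this approach only suits cases where an incomplete last chunk can be ignored.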