About memory efficiency: range vs xrange, zip vs izip


I was reading the following topic: Make dictionary from list with python

The initial problem is to transform the tuple (1,'a',2,'b',3,'c') into the dictionary {1: 'a', 2: 'b', 3: 'c'}. Many interesting solutions were given, including the following two:

Solution 1:

dict(x[i:i+2] for i in range(0, len(x), 2))

Solution 2:

dict(zip(*[iter(x)] * 2))

In solution 1, why bother creating an actual list with range? Wouldn't xrange(0, len(x), 2) be more memory efficient? Same question for solution 2: zip creates an actual list, so why not use itertools.izip instead?
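For concreteness, here is a minimal Python 2 sketch of the lazier variants I have in mind (assuming x is the same tuple as above):

from itertools import izip

x = (1, 'a', 2, 'b', 3, 'c')

# Solution 1 with xrange: indices are generated lazily instead of in a list.
d1 = dict(x[i:i + 2] for i in xrange(0, len(x), 2))

# Solution 2 with izip: pairs are produced lazily instead of in a list.
it = iter(x)
d2 = dict(izip(it, it))

assert d1 == d2 == {1: 'a', 2: 'b', 3: 'c'}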


There are 2 answers

Answer by Hrishi

As far as I know

dict(zip(*[iter(x)] * 2))

is the usual "Pythonic" way of doing it. And the approach in Python when it comes to optimizing stuff is to always profile and see where time is being spent. If the above approach works fine for your application, why optimize it?
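If you do decide to measure, a rough Python 2 timing sketch with timeit might look like the following (the tuple and repeat count are placeholders; substitute your real workload):

import timeit

setup = "from itertools import izip; x = (1, 'a', 2, 'b', 3, 'c')"

for stmt in (
    "dict(x[i:i+2] for i in range(0, len(x), 2))",   # eager index list
    "dict(x[i:i+2] for i in xrange(0, len(x), 2))",  # lazy indices
    "it = iter(x); dict(zip(it, it))",               # eager list of pairs
    "it = iter(x); dict(izip(it, it))",              # lazy pairs
):
    print stmt, timeit.timeit(stmt, setup=setup, number=100000)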

Answer by Raymond Hettinger

Why bother creating the actual list with range?

Yes, xrange(0, len(x), 2) would be more memory efficient.

Why not use itertools.izip() in Solution 2?

Yes, zip() creates an actual list, so you can save memory by using itertools.izip.

Does this really make a difference?

The speed differences are likely to be small. Memory efficiency translates to improved speed only when the data exceeds the size of memory caches. Some of the benefit is offset by the overhead of iterators.

Since the dictionary stores the keys and values anyway, the only memory saved is in the temporary tuples pointing to those keys and values. So the savings in this situation are much more modest than for other iterator applications that don't accumulate all of the results in a container.
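A rough way to see this (a Python 2 sketch using sys.getsizeof on placeholder data, not a precise measurement, since getsizeof only counts each object's own footprint):

import sys

x = tuple(range(1000))                        # placeholder data: 500 key/value pairs

pairs = zip(*[iter(x)] * 2)                   # Python 2: a real list of 500 2-tuples
list_bytes = sys.getsizeof(pairs)             # the list object's pointer array only
tuple_bytes = sum(sys.getsizeof(t) for t in pairs)   # the temporary 2-tuples
dict_bytes = sys.getsizeof(dict(pairs))       # the dict you keep anyway

print list_bytes, tuple_bytes, dict_bytes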

So this is all likely "much ado about nothing".

What about Python 3?

In Python 3, range() returns a lazy range object and zip() returns an iterator, so neither builds an intermediate list (xrange() and itertools.izip() no longer exist).
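So in Python 3 the original solutions are already lazy; for example:

# Python 3: zip() is already an iterator and range() is a lazy sequence,
# so neither solution materializes an intermediate list.
x = (1, 'a', 2, 'b', 3, 'c')

d1 = dict(x[i:i + 2] for i in range(0, len(x), 2))

it = iter(x)
d2 = dict(zip(it, it))

assert d1 == d2 == {1: 'a', 2: 'b', 3: 'c'}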