This came up in a nitpick discussion about the prefered style for iterating over dictionary keys if you need to apply some test to the value. I was comparing the performance of [k for k in d if d[k] == 1]
against [k for k, v in d.items() if v == 1]
.
Looks like dictionary lookups are more expensive in Python 3.4:
$ python -m timeit -n 1000000 \
-s 'd={k:v for v, k in enumerate("abcdefghijhklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")}' \
'[k for k in d if d[k] == 1]'
1000000 loops, best of 3: 2.17 usec per loop
$ python -m timeit -n 1000000 \
-s 'd={k:v for v, k in enumerate("abcdefghijhklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")}' \
'[k for k, v in d.items() if v == 1]'
1000000 loops, best of 3: 3.13 usec per loop
$ python3.4 -m timeit -n 1000000 \
-s 'd={k:v for v, k in enumerate("abcdefghijhklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")}' \
'[k for k in d if d[k] == 1]'
1000000 loops, best of 3: 3.25 usec per loop
$ python3.4 -m timeit -n 1000000 \
-s 'd={k:v for v, k in enumerate("abcdefghijhklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")}' \
'[k for k, v in d.items() if v == 1]'
1000000 loops, best of 3: 3.05 usec per loop
Are lookups more expensive in Python 3.4 compared to 2.7 and can you explain why?
It's not that lookups are more expensive1 in Python 3.4 than 2.7, but that
dict.items()
is cheaper. That's becausedict.items()
is a rather different beast in the two versions of the language:In Python 2,
dict.items()
constructs and returns a list, whereas in Python 3 it returns a dictionary view, and it turns out that iterating over this dynamic view into the dictionary is faster than constructing a list and then iterating over that.Although
dict.items()
doesn't return a dictionary view in 2.7, it is possible to get one withdict.viewitems()
, with similar performance benefits. Repeating your test, this time withviewitems()
included, I get:As regards the discussion that prompted your investigation, I'd say that the
d.items()
ord.viewitems()
approach is more explicit, and therefore more pythonic, but that's really more an aesthetic choice than anything else.1 Except inasmuch as Python 3.x is, as a general rule, slower than 2.x, but that's the price of progress ...