In Python the map function is lazy, but most often I need an eager map.
For example, trying to slice a map object results in an error:
>>>> map(abs, [3, -1, -4, 1])[1:]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'map' object is not subscriptable (key slice(1, None, None))
I guess I need to implement an eager map myself, so I wonder whether there's a standard way of doing that in Python.
I managed to implement it in a few different ways, but I'm not sure which alternative should be preferred. I'm asking about both CPython and PyPy3; if the answer varies depending on the Python implementation, I'd like to know about all relevant options.
These are my implementations:
def eager_map_impl0(f, *collections):
    return list(map(f, *collections))
def eager_map_impl1(f, *collections):
    return [x for x in map(f, *collections)]
def eager_map_impl2(f, *collections):
    return [*map(f, *collections)]
def eager_map_impl3(f, *collections):
    return [f(*x) for x in zip(*collections)]
Usage example:
>>>> eager_map_impl0(abs, [3, -1, -4, 1])[1:]
[1, 4, 1]
>>>> eager_map_impl1(abs, [3, -1, -4, 1])[1:]
[1, 4, 1]
>>>> eager_map_impl2(abs, [3, -1, -4, 1])[1:]
[1, 4, 1]
>>>> eager_map_impl3(abs, [3, -1, -4, 1])[1:]
[1, 4, 1]
Regarding the duplicate vote: the linked question and some of its answers are interesting, but I don't think they answer this question. I already know I want to use map, not list comprehensions, so I was hoping someone would say which implementation is the most performant in CPython vs. PyPy as an answer here.
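One way to compare the four variants on a given interpreter is a small timeit harness. This is only a sketch: the `benchmark` helper, the sample input, and the repeat counts are arbitrary illustration choices, not part of any standard API, and the numbers it prints will differ between CPython and PyPy3 (where JIT warm-up also plays a role).

```python
import timeit


def eager_map_impl0(f, *collections):
    return list(map(f, *collections))


def eager_map_impl1(f, *collections):
    return [x for x in map(f, *collections)]


def eager_map_impl2(f, *collections):
    return [*map(f, *collections)]


def eager_map_impl3(f, *collections):
    return [f(*x) for x in zip(*collections)]


IMPLS = [eager_map_impl0, eager_map_impl1, eager_map_impl2, eager_map_impl3]

# Arbitrary sample input; pick a size that reflects your real workload.
DATA = list(range(-5000, 5000))


def benchmark(repeats=5, number=200):
    """Return the best total time (seconds) over `repeats` runs per implementation.

    Hypothetical helper: each run calls the implementation `number` times.
    """
    results = {}
    for impl in IMPLS:
        # Bind impl as a default argument so each lambda keeps its own function.
        timer = timeit.Timer(lambda impl=impl: impl(abs, DATA))
        results[impl.__name__] = min(timer.repeat(repeat=repeats, number=number))
    return results


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.4f} s")
```

Running the same script under both `python3` and `pypy3` gives a per-interpreter comparison; taking the minimum of several repeats reduces noise from other processes.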
Calling list(map(...)), as in your first example, is enough for what you want, and you don't even need another function just to wrap it, as its intent is obvious.
The performance difference between list() and a comprehension in this case should be minimal, but with the advantage on the list() side, as no Python VM ops need to be executed per item: the list builder will call __next__ on the map iterator directly in native code. For PyPy that is unpredictable, since for sequences long enough to be meaningful the JIT will be triggered, and it may have ways of its own. (Anyway, the difference should not be relevant.)
Now, at times, one might want to just apply a map and not store the results, in cases where the mapping function performs I/O or has other side effects. In this case, the best-performing approach is little documented: the collections.deque structure, with a maxlen of 0, is actually internally optimized to consume all items in an iterator, and would process all your items.
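A minimal sketch of that idiom follows. The `consume_map` wrapper and the `seen` list are hypothetical names used only for illustration; the only library call involved is `collections.deque(iterable, maxlen=0)`.

```python
from collections import deque


def consume_map(func, *iterables):
    """Apply func across the iterables purely for side effects.

    Hypothetical helper: a deque with maxlen=0 stores nothing, so building it
    simply drains the map iterator and discards every result.
    """
    deque(map(func, *iterables), maxlen=0)


# Example side effect: appending each item to a list.
seen = []
consume_map(seen.append, [3, -1, -4, 1])
print(seen)  # [3, -1, -4, 1]
```

Because nothing is stored, memory use stays constant regardless of input length, unlike list(map(...)), which builds a full list of (possibly unwanted) return values.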