class Foo:
    def __getitem__(self, item):
        print('getitem', item)
        if item == 6:
            raise IndexError
        return item**2
    def __len__(self):
        print('len')
        return 3
class Bar:
    def __iter__(self):
        print('iter')
        return iter([3, 5, 42, 69])
    def __len__(self):
        print('len')
        return 3
Demo:
>>> list(Foo())
len
getitem 0
getitem 1
getitem 2
getitem 3
getitem 4
getitem 5
getitem 6
[0, 1, 4, 9, 16, 25]
>>> list(Bar())
iter
len
[3, 5, 42, 69]
Why does list call __len__? It doesn't seem to use the result for anything obvious. A for loop doesn't do it. This isn't mentioned anywhere in the iterator protocol, which just talks about __iter__ and __next__.
Is this Python reserving space for the list in advance, or something clever like that?
(CPython 3.6.0 on Linux)
See the Rationale section of PEP 424, which introduced __length_hint__ and offers insight into the motivation. In addition, the documentation for object.__length_hint__ confirms that this is purely an optimization feature.
So __length_hint__ is here because it can result in some nice optimizations.
PyObject_LengthHint first tries to get a value from object.__len__ (if it is defined) and then tries to see if object.__length_hint__ is available. If neither is there, it returns a default value, which is 8 for lists.
listextend, which is called from list_init as Eli stated in his answer, was modified according to this PEP to offer this optimization for anything that defines either a __len__ or a __length_hint__.
list isn't the only one that benefits from this, of course. bytes objects do too, so do bytearray objects (but only when you extend them), and so do tuple objects, which create an intermediary sequence to populate themselves.
If anybody is wondering why exactly 'iter' is printed before 'len' in class Bar, and not after as happens with class Foo: this is because if the object in hand defines an __iter__, Python will first call it to get the iterator, thereby running the print('iter') too. The same doesn't happen if it falls back to using __getitem__.
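The lookup order described for PyObject_LengthHint can be observed from Python through operator.length_hint, which exposes the same behaviour. The following is a minimal sketch, not the CPython source; the class names WithLen, WithHint, WithBoth and WithNeither are made up for illustration, and the default of 8 is passed explicitly only to mirror what the list code is described as doing.

```python
from operator import length_hint


class WithLen:
    # Only __len__ is defined, so the hint is the real length.
    def __len__(self):
        return 3


class WithHint:
    # Only __length_hint__ is defined, so that estimate is used.
    def __length_hint__(self):
        return 5


class WithBoth:
    # __len__ takes priority over __length_hint__.
    def __len__(self):
        return 3

    def __length_hint__(self):
        return 5


class WithNeither:
    # Neither method exists, so the caller's default is returned.
    pass


print(length_hint(WithLen()))         # 3
print(length_hint(WithHint()))        # 5
print(length_hint(WithBoth()))        # 3
print(length_hint(WithNeither(), 8))  # 8

# The hint is only an estimate used for preallocation: list() still
# consumes the whole iterator, even when the reported length is wrong.
class Lying:
    def __iter__(self):
        return iter([3, 5, 42, 69])

    def __len__(self):
        return 3


print(list(Lying()))  # [3, 5, 42, 69]
```

This also shows why the wrong __len__ in Bar does no harm: the value only sizes the initial allocation, and the list grows as usual if the iterator yields more items.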