Unpacking generalizations


PEP 448 -- Additional Unpacking Generalizations allowed:

>>> LOL = [[1, 2], ['three']]
>>> [*LOL[0], *LOL[1]]
[1, 2, 'three']

Alright! Goodbye itertools.chain. Never liked you much anyway.

>>> [*L for L in LOL]
  File "<ipython-input-21-e86d2c09c33f>", line 1
    [*L for L in LOL]
    ^
SyntaxError: iterable unpacking cannot be used in comprehension

Oh. Why can't we have nice things?

Unfortunately there are syntax errors for all of them:

[*l for l in lists]    # for l in lists: result.extend(l)
{*s for s in sets}     # for s in sets: result.update(s)
{**d for d in dicts}   # for d in dicts: result.update(d)
(*g for g in gens)     # for g in gens: yield from g

Unpacking in a comprehension seems to be obvious and Pythonic, and there's a quite natural extension from shorthand "for-loop & append" to "for-loop & extend".
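For comparison, each of the four rejected forms above can already be spelled with existing constructs. This is only a sketch: the sample data in `lists`, `sets` and `dicts` is made up for illustration, and `gens` is built from `lists` just to have some iterators to chain.

```python
from itertools import chain

# Hypothetical sample data, not taken from the question.
lists = [[1, 2], ['three']]
sets = [{1, 2}, {2, 3}]
dicts = [{'a': 1}, {'b': 2, 'a': 3}]

# Instead of [*l for l in lists]: a nested comprehension
# (list(chain.from_iterable(lists)) gives the same result).
flat_list = [x for l in lists for x in l]

# Instead of {*s for s in sets}
flat_set = {x for s in sets for x in s}

# Instead of {**d for d in dicts}: later dicts win on duplicate keys
merged = {k: v for d in dicts for k, v in d.items()}

# Instead of (*g for g in gens): chain.from_iterable stays lazy
gens = (iter(l) for l in lists)
flat_gen = chain.from_iterable(gens)
```

The nested comprehensions read left to right in the same order as the equivalent nested for-loops, which is the "for-loop & extend" behaviour described above.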

But since they've bothered to add that special error message, there was presumably a reason for disabling it. So, what's the problem with that syntax?

There are 2 answers

Delgan (best answer)

This is briefly explained in PEP 448, which introduced the unpacking generalizations:

Earlier iterations of this PEP allowed unpacking operators inside list, set, and dictionary comprehensions as a flattening operator over iterables of containers:

>>> ranges = [range(i) for i in range(5)]
>>> [*item for item in ranges]
[0, 0, 1, 0, 1, 2, 0, 1, 2, 3]

>>> {*item for item in ranges}
{0, 1, 2, 3}

This was met with a mix of strong concerns about readability and mild support. In order not to disadvantage the less controversial aspects of the PEP, this was not accepted with the rest of the proposal.

However, this may change in the future:

This PEP does not include unpacking operators inside list, set and dictionary comprehensions although this has not been ruled out for future proposals.


The PEP mentions "strong concerns about readability". I don't know the entire story, but the detailed discussions that led to this decision can be found in the mailing list archives.

Here is an example that would be ambiguous if unpacking were allowed in list comprehensions:

[*t for t in [(1, 'a'), (2, 'b'), (3, 'c')]]

According to one of the core developers, it would be surprising for the result to be [1, 'a', 2, 'b', 3, 'c'] and not [(1, 'a'), (2, 'b'), (3, 'c')].
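The two readings can be compared side by side using syntax that is valid today. This is a small sketch; the names `pairs`, `kept` and `flattened` are made up for illustration.

```python
pairs = [(1, 'a'), (2, 'b'), (3, 'c')]

# Reading 1: the star changes nothing and each tuple stays one element.
kept = [t for t in pairs]

# Reading 2: the star flattens each tuple into the surrounding list.
flattened = [x for t in pairs for x in t]
```

Here `kept` is `[(1, 'a'), (2, 'b'), (3, 'c')]` and `flattened` is `[1, 'a', 2, 'b', 3, 'c']`, the two results the core developer was weighing.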

Since there was no formal consensus, it was simpler not to allow these special cases.

Dimitris Fasarakis Hilliard

Quoting from the Python-Dev mailing list thread in which this feature was accepted:

So that leaves comprehensions. IIRC, during the development of the patch we realized that f(*x for x in xs) is sufficiently ambiguous that we decided to disallow it -- note that f(x for x in xs) is already somewhat of a special case because an argument can only be a "bare" generator expression if it is the only argument. The same reasoning doesn't apply (in that form) to list, set and dict comprehensions -- while f(x for x in xs) is identical in meaning to f((x for x in xs)), [x for x in xs] is NOT the same as [(x for x in xs)] (that's a list of one element, and the element is a generator expression)

(Emphasis mine)
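The special case described in the quote can be checked directly. In this sketch the helper `f` simply returns its positional arguments; it is an assumption for illustration, not the test function from the tracker.

```python
import types

def f(*args):
    # Hypothetical helper: hand back whatever positional arguments arrived.
    return args

xs = ['ab', 'cd']

# A bare generator expression as the only argument...
bare = f(x for x in xs)
# ...means the same as the explicitly parenthesized form.
wrapped = f((x for x in xs))

# For lists the extra parentheses change the meaning.
listcomp = [x for x in xs]     # a list of the two strings
one_gen = [(x for x in xs)]    # a list holding a single generator object
```

Both calls to `f` receive exactly one argument, a generator, while `listcomp` and `one_gen` are different objects with different lengths.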

I also took a peek at the Python issue tracker for this feature and found the issue in which it was discussed during implementation. The sequence of messages that led to this realization starts with msg234766 by GvR, which gives a nice overview of the ambiguity.

For fear of link rot, I'm attaching the (formatted) message here:

So I think the test function here should be:

def f(*a, **k): print(list(a), list(k))

Then we can try things like:

f(x for x in ['ab', 'cd'])

which prints a generator object, because this is interpreted as an argument that's a generator expression.

But now let's consider:

f(*x for x in ['ab', 'cd'])

I personally expected this to be equivalent to:

f(*'ab', *'cd')

IOW:

f('a', 'b', 'c', 'd')

The PEP doesn't give clarity on what to do here. The question now is, should we interpret things like *x for x in ... as an extended form of generator expression, or as an extended form of *arg? I somehow think the latter is more useful and also the more logical extension.

My reasoning is that the PEP supports things like f(*a, *b) and it would be fairly logical to interpret f(*x for x in xs) as doing the *x thing for each x in the list xs.
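To make the quoted message concrete, the sketch below runs the calls that are valid today. It assumes a variant of the test function that returns its arguments instead of printing them, and it uses `itertools.chain.from_iterable` as one way to get the flattening interpretation without the rejected syntax.

```python
from itertools import chain

def f(*a, **k):
    # Variant of the quoted test function: returns instead of printing.
    return list(a), list(k)

xs = ['ab', 'cd']

# Interpreted as a single argument that is a generator expression.
as_genexp = f(x for x in xs)

# The expansion expected in the message, written out by hand.
explicit = f(*'ab', *'cd')

# The same flattening for an arbitrary xs, using existing syntax.
via_chain = f(*chain.from_iterable(xs))
```

Both `explicit` and `via_chain` pass four positional arguments, `'a'`, `'b'`, `'c'` and `'d'`, whereas `as_genexp` passes just one.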

Finally, as noted in the Abstract section of the corresponding PEP, this feature isn't completely ruled out:

This PEP does not include unpacking operators inside list, set and dictionary comprehensions although this has not been ruled out for future proposals.

So we might get to see it sometime soon (definitely not in 3.6, though :-), and I hope we do; they look nice.