Instantiation time for mutable default arguments of closures in Python

130 views Asked by At

My understanding is that when Python parses the source code of a function, it compiles it to bytecode but doesn't run this bytecode before the function is called (which is why illegal variable names in functions does not throw an exception unless you call the function).

Default arguments are not instantiated during this initial setup of the function, but only when the function is called for the first time, regardless of whether the arguments are supplied or not. This same instance of the default argument is used for all future calls, which can be seen by using a mutable type as a default argument.

If we put the function inside of another function, however, the default argument now seems to be re-instantiated each time the outer function is called, as the following code shows:

def f(x):
    def g(y, a=[]):
        a.append(y)
        return a

    for y in range(x, x + 2):
        print('calling g from f:', g(y))
    return g(y + 1)

for x in range(2):
    print('calling f from module scope:', f(x))

This prints out

calling g from f: [0]
calling g from f: [0, 1]
calling f from module scope: [0, 1, 2]
calling g from f: [1]
calling g from f: [1, 2]
calling f from module scope: [1, 2, 3]

Does this mean that every time f is called, the bytecode of g is rebuild? This behavior seems unnecessary, and weird since the bytecode of f (which include g?) is only build once. Or perhaps it is only the default argument of g which is reinstantiated at each call to f?

3

There are 3 answers

6
juanpa.arrivillaga On BEST ANSWER

First misconception: "when Python parses the source code of a function, it compiles it to bytecode but doesn't run this bytecode before the function is called (which is why illegal variable names in functions does not throw an exception unless you call the function)." To be clear, your misconception is that "illegal variable names in functions does not throw an exception unless you call the function". Unassigned names will not be caught until the function is executed.

Check out this simple test:

In [1]: def g(a):
   ...:     123onetwothree = a
  File "<ipython-input-5-48a83ac30c7b>", line 2
    123onetwothree = a

Second misconception: "default arguments are not instantiated during this initial setup of the function, but only when the function is called for the first time...". This is incorrect.

In [7]: def f(x=[]):
   ...:     print(x)
   ...:     x.append(1)
   ...:     print(x)
   ...:
   ...:

In [8]: f.__defaults__
Out[8]: ([],)

In [9]: f()
[]
[1]

In [10]: f.__defaults__
Out[10]: ([1],)

In [11]:

As for your example, every time you run f the default argument is reinstantiated because you define g inside f. The best way to think of it is to think of the def statement as a constructor for function objects, and the default arguments like parameters to this constructor. Every time you run def some_function it is like calling the constructor all over again, and the function is redefined as if had written g = function(a=[]) in the body of f.

In response to comment

In [11]: def f(x=h()): pass
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-11-ecbb4a9f8673> in <module>()
----> 1 def f(x=h()): pass

NameError: name 'h' is not defined
0
Shain On

The inner function is rebuilt using existing bytecode for the inner function. It's easy to see using dis.

>>> import dis
>>> def make_func():
...     def my_func():
...         pass
...     return my_func
>>> dis.dis(make_func.__code__)
  3       0 LOAD_CONST               1 (<code object my_func at [...]", line 3>)
          3 MAKE_FUNCTION            0
          6 STORE_FAST               0 (my_func)

  5       9 LOAD_FAST                0 (my_func)
         12 RETURN_VALUE

Now if you do:

>>> f1 = make_func()
>>> f2 = make_func()
>>> f1 is f2
False
>>> f1.__code__ is f2.__code__
True
5
Dimitris Fasarakis Hilliard On

Just look at the bytecode for f with dis:

dis(f)
  2           0 BUILD_LIST               0
              3 LOAD_CONST               1 (<code object g at 0x7febd88411e0, file "<ipython-input-21-f2ef9ebb6765>", line 2>)
              6 LOAD_CONST               2 ('f.<locals>.g')
              9 MAKE_FUNCTION            1
             12 STORE_FAST               1 (g)

  6          15 SETUP_LOOP              46 (to 64)
             18 LOAD_GLOBAL              0 (range)
             21 LOAD_FAST                0 (x)
             24 LOAD_FAST                0 (x)
             27 LOAD_CONST               3 (2)
             30 BINARY_ADD
             31 CALL_FUNCTION            2 (2 positional, 0 keyword pair)
             34 GET_ITER
        >>   35 FOR_ITER                25 (to 63)
             38 STORE_FAST               2 (y)

(snipped for brevity)

The code object loaded for g:

3 LOAD_CONST               1 (<code object g at 0x7febd88411e0, file "<ipython-input-21-f2ef9ebb6765>", line 2>)

doesn't contain any mutable structures, it just contains the executable code and other immutable information. You could take a peek at it too:

dis(f.__code__.co_consts[1])
  3           0 LOAD_FAST                1 (a)
              3 LOAD_ATTR                0 (append)
              6 LOAD_FAST                0 (y)
              9 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             12 POP_TOP

  4          13 LOAD_FAST                1 (a)
             16 RETURN_VALUE

Every time f is called, MAKE_FUNCTION is called which re-creates the function from the byte code that already exists there.