Yes, you can replace an inner function, even if it is using a closure. You'll have to jump through a few hoops though. Please take into account:
- You need to create the replacement function as a nested function too, to ensure that Python creates the same closure. If the original function has a closure over the names foo and bar, you need to define your replacement as a nested function with the same names closed over. More importantly, you need to use those names in the same order; closures are referenced by index.
- Monkey patching is always fragile and can break with the implementation changing. This is no exception. Retest your monkey patch whenever you change versions of the patched library.
Code objects
To understand how this will work, I'll first explain how Python handles nested functions. Python uses code objects to produce function objects as needed. Each code object has an associated constants sequence, and the code objects for nested functions are stored in that sequence:
>>> def outerfunction(*args):
...     def innerfunction(val):
...         return someformat.format(val)
...     someformat = 'Foo: {}'
...     for arg in args:
...         yield innerfunction(arg)
...
>>> outerfunction.__code__
<code object outerfunction at 0x105b27ab0, file "<stdin>", line 1>
>>> outerfunction.__code__.co_consts
(None, <code object innerfunction at 0x10f136ed0, file "<stdin>", line 2>, 'outerfunction.<locals>.innerfunction', 'Foo: {}')
The co_consts sequence is an immutable object, a tuple, so we cannot just swap out the inner code object. I'll show later on how we'll produce a new function object with just that code object replaced.
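As a small illustration of the point above, here is a sketch that collects the nested code objects from a function's constants. The helper name nested_code_objects is made up for this example; it relies only on types.CodeType from the standard library.

```python
import types


def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)
    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)


def nested_code_objects(func):
    """Map name -> code object for every code object in func's constants.

    Hypothetical helper, written for this illustration only.
    """
    return {
        const.co_name: const
        for const in func.__code__.co_consts
        if isinstance(const, types.CodeType)
    }


nested = nested_code_objects(outerfunction)
print(sorted(nested))  # ['innerfunction']
```

Looking the code object up by name like this avoids depending on its exact index in co_consts, which can differ between Python versions.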
How closures are handled
Next, we need to cover closures. At compile time, Python determines that a) someformat is not a local name in innerfunction and that b) it is closing over the same name in outerfunction. Python then not only generates the bytecode to produce the correct name lookups; the code objects for both the nested and the outer functions are also annotated to record that someformat is to be closed over:
>>> outerfunction.__code__.co_cellvars
('someformat',)
>>> outerfunction.__code__.co_consts[1].co_freevars
('someformat',)
You want to make sure that the replacement inner code object only ever lists those same names as free variables, and does so in the same order.
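To see why the replacement has to be defined as a nested function, the following sketch compares two candidate replacements: one nested, one defined at module level. The names make_nested and innerfunction_flat are invented for this example. Only the nested version records someformat as a free variable; the flat version compiles the same name as a global lookup.

```python
def make_nested():
    someformat = None  # placeholder; only the name matters to the compiler

    def innerfunction(val):
        return someformat.format(val + 2)

    return innerfunction


def innerfunction_flat(val):
    # Not nested: someformat is compiled as a global name lookup here.
    return someformat.format(val + 2)


nested_code = make_nested().__code__
flat_code = innerfunction_flat.__code__

print(nested_code.co_freevars)             # ('someformat',)
print(flat_code.co_freevars)               # ()
print('someformat' in flat_code.co_names)  # True
```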
Closures are created at run time; the bytecode to produce them is part of the outer function:
>>> import dis
>>> dis.dis(outerfunction)
  2           0 LOAD_CLOSURE             0 (someformat)
              2 BUILD_TUPLE              1
              4 LOAD_CONST               1 (<code object innerfunction at 0x10f136ed0, file "<stdin>", line 2>)
              6 LOAD_CONST               2 ('outerfunction.<locals>.innerfunction')
              8 MAKE_FUNCTION            8 (closure)
             10 STORE_FAST               1 (innerfunction)
# ... rest of disassembly omitted ...
The LOAD_CLOSURE bytecode there loads the closure cell for the someformat variable; Python creates as many closure cells as the inner function uses, in the order recorded in the inner code object's co_freevars tuple. This is an important fact to remember for later. The function itself looks up these closures by position:
>>> dis.dis(outerfunction.__code__.co_consts[1])
  3           0 LOAD_DEREF               0 (someformat)
              2 LOAD_METHOD              0 (format)
              4 LOAD_FAST                0 (val)
              6 CALL_METHOD              1
              8 RETURN_VALUE
The LOAD_DEREF opcode picked the closure at position 0 here to gain access to the someformat closure.
In theory this also means you can use entirely different names for the closures in your inner function, but for debugging purposes it makes much more sense to stick to the same names. It also makes verifying that the replacement function will slot in properly easier, as you can just compare the co_freevars tuples if you use the same names.
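The positional pairing between names and closure cells can also be inspected at run time. As a rough sketch (the functions outer and inner are invented for this example), a function object's __closure__ tuple lines up index by index with its code object's co_freevars:

```python
def outer():
    someformat = 'Foo: {}'

    def inner(val):
        return someformat.format(val)

    return inner


inner = outer()

# __closure__ holds one cell per free variable, in co_freevars order.
cells = dict(zip(
    inner.__code__.co_freevars,
    (cell.cell_contents for cell in inner.__closure__),
))
print(cells)  # {'someformat': 'Foo: {}'}
```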
replace_inner_function()
Now for the swapping trick. Functions are objects like any other in Python, instances of a specific type. The type isn't exposed normally, but the type() call still returns it. The same applies to code objects, and both types even have documentation:
>>> type(outerfunction)
<class 'function'>
>>> print(type(outerfunction).__doc__)
Create a function object.

  code
    a code object
  globals
    the globals dictionary
  name
    a string that overrides the name from the code object
  argdefs
    a tuple that specifies the default argument values
  closure
    a tuple that supplies the bindings for free variables
>>> type(outerfunction.__code__)
<class 'code'>
>>> print(type(outerfunction.__code__).__doc__)
code(argcount, posonlyargcount, kwonlyargcount, nlocals, stacksize,
      flags, codestring, constants, names, varnames, filename, name,
      firstlineno, lnotab[, freevars[, cellvars]])

Create a code object.  Not for the faint of heart.
(The exact argument count and docstring vary between Python versions; Python 3.0 added the kwonlyargcount argument, and Python 3.8 added posonlyargcount.)
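As a minimal sketch of how these two types can be used, assuming Python 3.8 or newer (for code.replace()), the example below swaps a string constant in a throwaway function. The function sample and its constant are invented for illustration. The standard library also exposes the same type objects as types.FunctionType and types.CodeType.

```python
import types


def sample():
    return 'original'


# The types module exposes the same objects that type() returns.
print(type(sample) is types.FunctionType)       # True
print(type(sample.__code__) is types.CodeType)  # True

# Build a new constants tuple with one entry swapped out.
new_consts = tuple(
    'patched' if const == 'original' else const
    for const in sample.__code__.co_consts
)

# Assumes Python 3.8+, where code objects have a replace() method.
new_code = sample.__code__.replace(co_consts=new_consts)

# Construct a new function object around the new code object.
patched = types.FunctionType(
    new_code, sample.__globals__, sample.__name__,
    sample.__defaults__, sample.__closure__,
)

print(sample(), patched())  # original patched
```

The original function object is left untouched; only the newly constructed function sees the changed constants.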
We'll use these type objects to produce a new code object with updated constants, and then a new function object with the updated code object; the following function is compatible with Python versions 2.7 through to 3.8.
def replace_inner_function(outer, new_inner):
    """Replace a nested function code object used by outer with new_inner

    The replacement new_inner must use the same name and must at most use the
    same closures as the original.
    """
    if hasattr(new_inner, '__code__'):
        # support both functions and code objects
        new_inner = new_inner.__code__

    # find original code object so we can validate the closures match
    ocode = outer.__code__
    function, code = type(outer), type(ocode)
    iname = new_inner.co_name
    orig_inner = next(
        const for const in ocode.co_consts
        if isinstance(const, code) and const.co_name == iname)

    # you can ignore later closures, but since they are matched by position
    # the new sequence must match the start of the old.
    assert (orig_inner.co_freevars[:len(new_inner.co_freevars)] ==
            new_inner.co_freevars), 'New closures must match originals'

    # replace the code object for the inner function
    new_consts = tuple(
        new_inner if const is orig_inner else const
        for const in outer.__code__.co_consts)

    # create a new code object with the new constants
    try:
        # Python 3.8 added code.replace(), so much more convenient!
        ncode = ocode.replace(co_consts=new_consts)
    except AttributeError:
        # older Python versions, argument counts vary so we need to check
        # for specifics.
        args = [
            ocode.co_argcount, ocode.co_nlocals, ocode.co_stacksize,
            ocode.co_flags, ocode.co_code,
            new_consts,  # replacing the constants
            ocode.co_names, ocode.co_varnames, ocode.co_filename,
            ocode.co_name, ocode.co_firstlineno, ocode.co_lnotab,
            ocode.co_freevars, ocode.co_cellvars,
        ]
        if hasattr(ocode, 'co_kwonlyargcount'):
            # Python 3+, insert after co_argcount
            args.insert(1, ocode.co_kwonlyargcount)
        # Python 3.8 adds co_posonlyargcount, but also has code.replace(), used above
        ncode = code(*args)

    # and a new function object using the updated code object
    return function(
        ncode, outer.__globals__, outer.__name__,
        outer.__defaults__, outer.__closure__
    )
The above function validates that the new inner function (which can be passed in as either a code object or as a function) will indeed use the same closures as the original. It then creates new code and function objects to match the old outer function object, but with the nested function (located by name) replaced with your monkey patch.
Let's try it out
To demonstrate that the above all works, let's replace innerfunction with one that increments each formatted value by 2:
>>> def create_inner():
...     someformat = None  # the actual value doesn't matter
...     def innerfunction(val):
...         return someformat.format(val + 2)
...     return innerfunction
...
>>> new_inner = create_inner()
The new inner function is created as a nested function too; this is important as it ensures that Python will use the correct bytecode to look up the someformat closure. I used a return statement to extract the function object, but you could also look at create_inner.__code__.co_consts to grab the code object.
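The two ways of getting at the replacement code object mentioned above yield the very same object. A short sketch of that, searching co_consts by type rather than by a fixed index (the variable names are chosen for this example only):

```python
import types


def create_inner():
    someformat = None  # the actual value doesn't matter

    def innerfunction(val):
        return someformat.format(val + 2)

    return innerfunction


# Option 1: call the factory and take the code object from the function.
code_from_call = create_inner().__code__

# Option 2: pull the code object straight from the factory's constants.
code_from_consts = next(
    const for const in create_inner.__code__.co_consts
    if isinstance(const, types.CodeType)
)

print(code_from_call is code_from_consts)  # True
```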
Now we can patch the original outer function, swapping out just the inner function:
>>> new_outer = replace_inner_function(outerfunction, new_inner)
>>> list(outerfunction(6, 7, 8))
['Foo: 6', 'Foo: 7', 'Foo: 8']
>>> list(new_outer(6, 7, 8))
['Foo: 8', 'Foo: 9', 'Foo: 10']
The original function formats the values unchanged, while the new function formats each value incremented by 2.
You can even create new replacement inner functions that use fewer closures:
>>> def demo_outer():
...     closure1 = 'foo'
...     closure2 = 'bar'
...     def demo_inner():
...         print(closure1, closure2)
...     demo_inner()
...
>>> def create_demo_inner():
...     closure1 = None
...     def demo_inner():
...         print(closure1)
...
>>> replace_inner_function(demo_outer, create_demo_inner.__code__.co_consts[1])()
foo
In a nutshell
So, to complete the picture:
1. Create your monkey-patch inner function as a nested function with the same closures in the same order.
2. Use the above replace_inner_function() to produce a new outer function.
3. Monkey patch the original outer function to use the new outer function produced in step 2.
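Put together, the three steps might look like the sketch below. It makes several assumptions: Python 3.8 or newer (for code.replace()), a made-up module named fakelib standing in for the library being patched, and a condensed helper called replace_inner_code that is a simplified stand-in for replace_inner_function() rather than a drop-in copy of it.

```python
import types

# Hypothetical stand-in for a third-party module; 'fakelib' is invented here.
fakelib = types.ModuleType('fakelib')
exec('''
def outerfunction(*args):
    def innerfunction(val):
        return someformat.format(val)
    someformat = 'Foo: {}'
    for arg in args:
        yield innerfunction(arg)
''', fakelib.__dict__)


# Step 1: the replacement, nested so that someformat is a free variable.
def create_inner():
    someformat = None  # the actual value doesn't matter

    def innerfunction(val):
        return someformat.format(val + 2)

    return innerfunction


# Step 2: condensed helper, assuming Python 3.8+ (code.replace()).
def replace_inner_code(outer, new_inner):
    new_code = new_inner.__code__
    ocode = outer.__code__
    orig = next(
        const for const in ocode.co_consts
        if isinstance(const, types.CodeType)
        and const.co_name == new_code.co_name)
    # Closures are matched by position, so the names must line up.
    if orig.co_freevars[:len(new_code.co_freevars)] != new_code.co_freevars:
        raise ValueError('New closures must match originals')
    consts = tuple(
        new_code if const is orig else const for const in ocode.co_consts)
    return types.FunctionType(
        ocode.replace(co_consts=consts), outer.__globals__, outer.__name__,
        outer.__defaults__, outer.__closure__)


original = fakelib.outerfunction

# Step 3: rebind the name on the module to the patched outer function.
fakelib.outerfunction = replace_inner_code(original, create_inner())

print(list(original(6, 7, 8)))               # ['Foo: 6', 'Foo: 7', 'Foo: 8']
print(list(fakelib.outerfunction(6, 7, 8)))  # ['Foo: 8', 'Foo: 9', 'Foo: 10']
```

Keeping a reference to the original function, as done here with original, makes it easy to undo the patch or to compare behaviour in tests.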