We've seen how easy it is to modify CPython—it's a C program after all, and one written to be understood and extended by normal human beings. But beyond the slow, deliberative process of adding features to the core language, what is the value of hacking your interpreter?
This talk presents a practical example of dynamic code instrumentation using hacked interpreters.
At PyData Berlin 2015, we saw the **kwarg(h!!) problem. Libraries with convenience layers, like matplotlib and pandas, often end up with convenience functions whose signatures give users little guidance on how to call them.
A convenience function with signature f(*args, **kwargs) is written in this fashion to avoid introducing dependencies on the downstack functions it calls. Unfortunately, from a documentation perspective, this gives the user very little guidance on what the acceptable inputs for f might be. If the documentation authors follow the call stack from f (assuming there is a single, fixed, static call graph), then they may reintroduce this dependency in the form of documentation.
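As a small illustration of how little such a signature reveals, the sketch below wraps a function with an explicit signature in a pass-through convenience function. The names plot and draw_line are hypothetical stand-ins invented for this example, not functions from matplotlib or pandas.

```python
import inspect


def draw_line(xs, ys, color="black", linewidth=1.0):
    """Hypothetical downstack function with an explicit signature."""
    return {"xs": list(xs), "ys": list(ys), "color": color, "linewidth": linewidth}


def plot(*args, **kwargs):
    """Hypothetical convenience wrapper: forwards everything to draw_line."""
    return draw_line(*args, **kwargs)


# The wrapper's signature says nothing about which inputs are accepted.
print(inspect.signature(plot))       # (*args, **kwargs)
print(inspect.signature(draw_line))  # (xs, ys, color='black', linewidth=1.0)
```

Introspection and help() on plot show only the pass-through signature. The useful information lives one level down the call stack, in draw_line.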
This is the **kwarg(h!!) problem. We can use a hacked interpreter to explore a possible solution: dynamic code instrumentation. With a hacked CPython interpreter, we can instrument function calls so as to automatically build an instrumented call graph from a test suite or a suite of examples. From this instrumentation, we can derive useful artefacts to improve understanding of the library and to drive better documentation.
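To give a rough feel for the kind of artefact such instrumentation might produce, here is a minimal sketch that records caller-to-callee edges while an example runs. It is not the hacked-interpreter approach the talk describes: it swaps in the standard sys.setprofile hook of an unmodified CPython as a stand-in. The functions plot, draw_line, and record_calls are hypothetical names made up for this sketch.

```python
import sys
from collections import defaultdict


def draw_line(xs, ys, color="black", linewidth=1.0):
    """Hypothetical downstack function."""
    return (xs, ys, color, linewidth)


def plot(*args, **kwargs):
    """Hypothetical convenience wrapper."""
    return draw_line(*args, **kwargs)


def record_calls(func, *args, **kwargs):
    """Run func and return {(caller, callee): declared parameter names of callee}."""
    edges = defaultdict(set)

    def profiler(frame, event, arg):
        # Only Python-level calls that have a calling frame are of interest.
        if event != "call" or frame.f_back is None:
            return
        code = frame.f_code
        caller = frame.f_back.f_code.co_name
        # Declared positional and keyword-only parameters (excludes *args/**kwargs).
        nparams = code.co_argcount + code.co_kwonlyargcount
        edges[(caller, code.co_name)].update(code.co_varnames[:nparams])

    sys.setprofile(profiler)
    try:
        func(*args, **kwargs)
    finally:
        sys.setprofile(None)  # always remove the hook, even on error
    return dict(edges)


# Assumed usage: one example call stands in for a test suite.
graph = record_calls(plot, [0, 1], [1, 2], color="red")
print(sorted(graph[("plot", "draw_line")]))
```

From the recorded edge between plot and draw_line, a documentation tool could note that plot was observed forwarding to a function that accepts xs, ys, color, and linewidth. A modified interpreter could presumably capture richer information than this hook exposes, but the shape of the output (an observed call graph annotated with parameters) is the same idea.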