Pandas has accrued a sizable debt in flexibility and maintainability to deliver excellent performance. This talk will show how Pandas maintainers and Two Sigma are using Numba to pay off some of this debt in one of the gnarliest parts of the code: window operations. If merged into mainline Pandas, this work will deduplicate code, make the code easier to debug, and make window operations extensible.
This is an advanced talk, aimed at those interested in the internals of Pandas or using Numba to optimize Python code. It does not assume that the audience is already familiar with the Pandas codebase or window operations. However, it does assume that the audience has basic familiarity with different code optimization options, like Cython and Numba.
An ordered outline of the talk follows (first-level bullets are sections of the talk, while second-level bullets are the main ideas I want to convey in that section).
- (2 minutes) Speaker intro, legal disclaimer (required by Two Sigma)
- (5 minutes) Discuss how Pandas has achieved its impressive use case coverage by focusing on commonly used features.
  - Trade-off was made: optimization at the cost of flexibility.
- (5 minutes) Introduce window operations via examples.
  - Explain what a rolling average is.
  - Explain what rolling variance is (very similar to rolling average).
  - Explain a more complex window operation: exponentially-weighted rolling average.
  - Operations and windows can be mixed and matched to generate many more, such as an exponentially-weighted moving variance.
- (5 minutes) Explain how all of the aforementioned operations are implemented in Pandas, using Cython.
  - Even though windows and aggregations can conceptually be combined, they do not share code.
  - Show slides of Pandas' internal code to demonstrate how little code is shared.
  - Although Pandas code is optimized with Cython, user-defined functions (functions passed to DataFrame.apply) are slow because they are written in plain Python.
- (10 minutes) What if we used Numba instead of Cython in the backend?
  - User-defined functions could be just-in-time (JIT) compiled.
  - Solution: implement window operations as two components, 'aggregators' and 'kernels'.
  - 'Aggregators' know how to obtain a specific window; 'kernels' know how to apply a mathematical operation to a window.
  - Aggregators and kernels are implemented as Numba JIT-compiled classes.
  - Can now mix the 'rolling' aggregator with the 'mean' kernel to implement rolling averages!
  - Can JIT a user-defined function to create a kernel.
- (5 minutes) Benefits
  - Numba code is easier to debug
  - Easier to maintain: componentized operations consolidate multiple implementations
  - Existing test suite and API can be preserved
- (8 minutes) Q&A
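
The aggregator/kernel split described in the outline can be illustrated with a minimal sketch. This is not the Pandas implementation or the code presented in the talk: every name below (`rolling_bounds`, `expanding_bounds`, `mean_kernel`, `var_kernel`, `range_kernel`, `apply_window`) is hypothetical, and plain JIT-compiled functions stand in for the JIT-compiled classes the outline mentions. The sketch also assumes that partial windows at the start of the data are evaluated rather than returned as NaN, which differs from the Pandas default. If Numba is not installed, the same code runs as ordinary Python.

```python
import numpy as np

try:
    from numba import njit  # JIT-compile when Numba is available
except ImportError:
    def njit(func):  # fallback: leave the function as plain Python
        return func


# Aggregators (hypothetical names): map an output position to the
# [start, stop) bounds of the window that ends at that position.
@njit
def rolling_bounds(i, width):
    return max(0, i - width + 1), i + 1


@njit
def expanding_bounds(i, width):
    # The window always starts at the first element; width is unused.
    return 0, i + 1


# Kernels (hypothetical names): reduce a single window to a scalar.
@njit
def mean_kernel(window):
    return window.sum() / window.size


@njit
def var_kernel(window):
    # Sample variance (ddof=1); undefined for fewer than two points.
    if window.size < 2:
        return np.nan
    m = window.sum() / window.size
    return ((window - m) ** 2).sum() / (window.size - 1)


# A user-defined function becomes a kernel simply by being JIT-compiled.
@njit
def range_kernel(window):
    return window.max() - window.min()


# Driver: any aggregator can be combined with any kernel.
@njit
def apply_window(values, width, bounds, kernel):
    out = np.empty(values.size, dtype=np.float64)
    for i in range(values.size):
        start, stop = bounds(i, width)
        out[i] = kernel(values[start:stop])
    return out


values = np.array([1.0, 2.0, 4.0, 7.0])
rolling_mean = apply_window(values, 2, rolling_bounds, mean_kernel)
expanding_var = apply_window(values, 2, expanding_bounds, var_kernel)
rolling_range = apply_window(values, 3, rolling_bounds, range_kernel)
print(rolling_mean)  # values: 1.0, 1.5, 3.0, 5.5
```

Because the driver takes the aggregator and the kernel as arguments, adding a new window type or a new operation means writing one small function rather than a new copy of the whole loop, which is the deduplication benefit the outline points to.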