Monday 1:20 PM–2:05 PM in Central Park East (6501a)

Cleaning, optimizing and windowing pandas with numba

Diego Torres Quintanilla

Audience level:
Experienced

Description

Pandas has accrued a sizable debt in flexibility and maintainability to deliver excellent performance. This talk will show how Pandas maintainers and Two Sigma are using Numba to pay off some of this debt in one of the gnarliest parts of the code: window operations. If merged into mainline Pandas, this work will deduplicate code, make that code easier to debug, and make window operations extensible.

Abstract

This is an advanced talk, aimed at those interested in the internals of Pandas or using Numba to optimize Python code. It does not assume that the audience is already familiar with the Pandas codebase or window operations. However, it does assume that the audience has basic familiarity with different code optimization options, like Cython and Numba.
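For readers new to window operations, the following is a minimal sketch of a trailing rolling mean compiled with Numba. It is not code from the talk or from the Pandas codebase; the function name `rolling_mean` and the sample data are hypothetical, and the fallback decorator is there only so the example still runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # optional: used only if Numba is installed
except ImportError:
    def njit(func):
        # Fallback (assumption for portability): run the plain Python function.
        return func


@njit
def rolling_mean(values, window):
    """Trailing fixed-size rolling mean; NaN until the window is full."""
    n = len(values)
    out = np.full(n, np.nan)
    running = 0.0
    for i in range(n):
        running += values[i]               # add the newest element
        if i >= window:
            running -= values[i - window]  # drop the element leaving the window
        if i >= window - 1:
            out[i] = running / window
    return out


data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
result = rolling_mean(data, 3)
```

For this input the result should match what `pd.Series(data).rolling(3).mean()` returns in Pandas: two leading NaN values followed by the means of each full window.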

An ordered outline of the talk is as follows (first-level bullets are sections of the talk, while second-level bullets are the main ideas I want to convey in each section).
