Friday 13:45–15:15 in GoDataDriven

Python Disfluencies

James Powell

Audience level:
Novice

Description

As a language, Python tends to be very forgiving of "bad code"; in fact, one of Python's core strengths is that one can hack together something pretty terrible, and it'll mostly work and contribute to getting "real work" done. That said, when a codebase must be maintained and extended over a long time-horizon, maintainability concerns demand a better approach.

This tutorial is a live-coding exercise, covering common patterns seen when reviewing and refactoring code written by non-professional programmers, such as data scientists. It frames these as "disfluencies": code where the meaning is otherwise clear, where some goal is otherwise accomplished, but where the approach taken may be clumsy, prone to misunderstanding, or otherwise difficult to rely upon.
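
As a rough illustration of what such a disfluency might look like (this is a hypothetical sketch, not an example taken from the tutorial, and the function names and data are invented for the purpose), consider two ways of computing the same total. The first works, but its index bookkeeping hides the intent; the second states the intent directly.

```python
# Hypothetical example: the names and data are illustrative only
# and are not taken from the tutorial.

def total_by_index(prices, quantities):
    # Disfluent: the loop is about indices, not about the values we care about.
    total = 0
    for i in range(len(prices)):
        total = total + prices[i] * quantities[i]
    return total


def total_by_zip(prices, quantities):
    # More fluent: iterate over the paired values directly.
    return sum(price * quantity for price, quantity in zip(prices, quantities))


prices = [2, 5, 10]
quantities = [3, 1, 2]
result_index = total_by_index(prices, quantities)
result_zip = total_by_zip(prices, quantities)
```

Both functions return the same total for equal-length inputs. They differ when the lengths do not match (the indexed version may raise an IndexError, while zip silently stops at the shorter input), which is the kind of subtlety a review would need to weigh.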

Abstract

Data scientists need to know how to code. Even beyond cleaning and munging exercises, proper data analysis quickly moves beyond simple pandas.DataFrame operations, and demands for reusability and long-term maintainability require data scientists to write "real" code. This tutorial is a live-coding exercise, covering common "disfluencies" found when reviewing and refactoring code written by non-professional programmers.
