As a language, Python tends to be very forgiving of "bad code"; in fact, one of Python's core strengths is that one can hack together something pretty terrible, and it'll mostly work and contribute to getting "real work" done. That said, when a codebase must be maintained and extended over a long time horizon, maintainability concerns demand a better approach.
This tutorial is a live-coding exercise, covering common patterns seen when reviewing and refactoring code written by non-professional programmers, such as data scientists. It frames these as "disfluencies": code where the meaning is clear and the goal is accomplished, but where the approach taken may be clumsy, prone to misunderstanding, or otherwise difficult to rely upon.
Data scientists need to know how to code. Even beyond cleaning and munging exercises, proper data analysis quickly moves beyond simple pandas.DataFrame operations, and demands for reusability and long-term maintainability require data scientists to write "real" code. This tutorial is a live-coding exercise, covering common "disfluencies" found when reviewing and refactoring code written by non-professional programmers.