Thursday 3:50 PM–4:35 PM in Track 2 - Kodiak

Make it Work, Make it Right, Make it Fast - Debugging and Profiling in Dask

Jim Crist

Audience level:
Novice

Description

Dask is a pure Python library for parallel and distributed computing. It's designed with flexibility in mind, making it easy to parallelize the complicated workflows often found in science. However, once you get something working, how do you debug or profile it? In this talk we'll cover the various tools Dask provides for diagnosing bugs and bottlenecks, as well as tips for resolving these issues.

Abstract

Dask is a pure Python library for parallel and distributed computing. It's designed with simplicity and flexibility in mind, making it easy to parallelize the complicated workflows often found in science. However, once you get something working, how do you debug or profile it? Debugging and profiling parallel code is notoriously hard! In this talk we'll cover the various tools Dask provides for diagnosing bugs and performance bottlenecks, as well as tips and techniques for resolving these issues.

Starting with an example single-threaded program, we'll walk through adding Dask to parallelize it, and then iterate on this example to gradually improve performance throughout the talk. Attendees should leave with a better understanding of:
