Dask is a pure Python library for parallel and distributed computing. It's designed with flexibility in mind, making it easy to parallelize the complicated workflows often found in science. However, once you get something working, how do you debug or profile it? In this talk we'll cover the various tools Dask provides for diagnosing bugs and bottlenecks, as well as tips for resolving these issues.
Dask is a pure Python library for parallel and distributed computing. It's designed with simplicity and flexibility in mind, making it easy to parallelize the complicated workflows often found in science. However, once you get something working, how do you debug or profile it? Debugging and profiling parallel code is notoriously hard! In this talk we'll cover the various tools Dask provides for diagnosing bugs and performance bottlenecks, as well as tips and techniques for resolving these issues.
Starting with an example single-threaded program, we'll walk through adding Dask to parallelize it, then iterate on this example to gradually improve its performance throughout the talk. Attendees should leave with a better understanding of: