Sunday 10:15 AM–11:00 AM in Track 1

Introducing Dask-Gateway: Dask clusters as a service

Jim Crist

Audience level:
Novice

Description

Dask-Gateway provides a secure, multi-tenant server for managing Dask clusters. It allows users to launch and use Dask clusters in a shared, centrally managed environment, and supports a wide variety of backends (e.g. Kubernetes, Hadoop, and HPC systems). In this talk we'll discuss the use and design of Dask-Gateway, as well as some of the issues we encountered while developing this tool.
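To give a feel for what "launch and use" looks like from the user's side, here is a minimal sketch using the `dask_gateway` client library. The helper name `launch_cluster`, the default worker count, and any gateway address passed to it are illustrative assumptions, not part of the talk; a real deployment would also need whatever authentication the gateway server is configured with.

```python
def launch_cluster(gateway_address, n_workers=2):
    """Start a Dask cluster through a gateway server.

    Hypothetical helper for illustration; returns (cluster, client).
    """
    # Imported lazily so the function can be defined without
    # dask-gateway installed; calling it requires the package and
    # a reachable gateway server.
    from dask_gateway import Gateway

    # Connect to the centrally managed gateway server.
    gateway = Gateway(gateway_address)

    # Ask the gateway to start a new cluster on its configured backend.
    cluster = gateway.new_cluster()

    # Request a fixed number of workers.
    cluster.scale(n_workers)

    # Obtain a Dask client connected to the new cluster.
    client = cluster.get_client()
    return cluster, client
```

A caller might run `cluster, client = launch_cluster("http://gateway.example.com")`, where the address is a placeholder, and call `cluster.shutdown()` when finished. Note that the user never talks to Kubernetes, Hadoop, or the HPC scheduler directly; the gateway server handles that.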

Abstract

Dask has become a standard tool for parallelizing computational Python work, scaling from laptops to distributed clusters. Its compatibility with a wide variety of computing environments has been a major strength, allowing users to easily deploy on everything from Kubernetes to traditional HPC systems. New backends can be added by implementing a standard cluster interface, which then plays well with the rest of the Dask ecosystem.
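The idea of adding a backend by implementing a common interface can be sketched in plain Python. This is an illustration only: `ClusterBackend`, `InMemoryBackend`, and `start` are hypothetical names, and the method set shown here is an assumption rather than Dask's actual cluster interface.

```python
from abc import ABC, abstractmethod


class ClusterBackend(ABC):
    """Hypothetical minimal interface that every backend implements."""

    @property
    @abstractmethod
    def scheduler_address(self) -> str:
        """Address that clients use to reach the scheduler."""

    @abstractmethod
    def scale(self, n: int) -> None:
        """Grow or shrink the cluster to n workers."""

    @abstractmethod
    def close(self) -> None:
        """Release all resources held by the cluster."""


class InMemoryBackend(ClusterBackend):
    """Toy backend that only tracks worker names (no real processes)."""

    def __init__(self):
        self.workers = []
        self.closed = False

    @property
    def scheduler_address(self):
        return "tcp://127.0.0.1:8786"  # placeholder address

    def scale(self, n):
        # Add workers until there are n of them...
        while len(self.workers) < n:
            self.workers.append(f"worker-{len(self.workers)}")
        # ...or drop any beyond the first n.
        del self.workers[n:]

    def close(self):
        self.scale(0)
        self.closed = True


def start(backend: ClusterBackend, n_workers: int) -> str:
    """Code written against the interface works with any backend."""
    backend.scale(n_workers)
    return backend.scheduler_address
```

Because `start` depends only on the abstract interface, a Kubernetes or HPC backend could be substituted for `InMemoryBackend` without changing the calling code.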

While this design has served us well, there are a few pain points that have come up when using Dask at larger institutions. Some of these issues could be remedied by changes to the existing deployment design, but many of them required something new. We believe that something is Dask-Gateway.

Dask-Gateway is:

In this talk we'll discuss the use and design of Dask-Gateway, as well as some of the issues we encountered while developing this tool.
