Presentation: hydra-zen: Configurable, Reproducible, and Scalable Computing with Hydra

Saturday October 30 9:00 PM – Saturday October 30 9:30 PM in Talks I

Ryan Soklaski

Prior knowledge: Previous knowledge expected
Writing code (in Python) to run analyses, experiments, or simulations

Summary

This talk encourages researchers in STEM fields to design their code-based projects to be configurable, reproducible, and scalable. We introduce hydra-zen, an open-source library that extends Hydra to simplify and standardize the design process. hydra-zen can be used to configure and orchestrate complex applications, like machine learning experiments, without creating a glut of boilerplate code.

Description

This talk is for Pythonistas in STEM fields who write code to run experiments, simulations, and analyses. These applications often have many configurable components (settings, hyperparameters, etc.), and their results need to be recorded with the associated configuration that produced those results. We present a simple and standardized way to configure such projects, leveraging two open-source Python libraries: Hydra and hydra-zen.

(minutes 0-5) We encourage researchers to design their code-based projects to be configurable, reproducible, and scalable. We argue that doing so not only improves their productivity but also significantly extends the "shelf life" of their projects.

(minutes 5-10) Designing one's project to be configurable, reproducible, and scalable, without assistive libraries, is a time-consuming process that produces large amounts of boilerplate code. We highlight the steep design challenges and pitfalls that obstruct this process.
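To make the boilerplate problem concrete, here is a minimal sketch of a hand-rolled configuration layer that uses only the Python standard library. The names `ExperimentConfig`, `parse_config`, and `record` are hypothetical and chosen for illustration; they are not taken from the talk. Note that every setting is declared twice, once in the dataclass and once in the argument parser, so the two can drift apart as the project grows.

```python
import argparse
import json
from dataclasses import asdict, dataclass


@dataclass
class ExperimentConfig:
    """Hypothetical settings for an experiment (illustrative names only)."""
    learning_rate: float = 0.01
    num_epochs: int = 10
    seed: int = 0


def parse_config(argv=None) -> ExperimentConfig:
    # Each field has to be repeated here by hand, including its default.
    parser = argparse.ArgumentParser()
    parser.add_argument("--learning-rate", type=float, default=0.01)
    parser.add_argument("--num-epochs", type=int, default=10)
    parser.add_argument("--seed", type=int, default=0)
    args = parser.parse_args(argv)
    return ExperimentConfig(args.learning_rate, args.num_epochs, args.seed)


def record(config: ExperimentConfig, result: dict) -> str:
    # Store the result together with the configuration that produced it.
    return json.dumps({"config": asdict(config), "result": result}, sort_keys=True)


config = parse_config(["--learning-rate", "0.1"])
# The result is a placeholder; no real experiment is run in this sketch.
print(record(config, result={"loss": None}))
```

Adding a single new setting means touching the dataclass, the parser, and the constructor call, which is the kind of repetitive work the talk argues should be delegated to a library.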

(minutes 10-20) We introduce the audience to Hydra and hydra-zen. Hydra is a framework for configuring and orchestrating applications, and hydra-zen provides powerful tools for dynamically and automatically generating configurations for your code. Together, these libraries can be used to elegantly configure, run, and organize results for complex applications, like machine learning experiments.

(minutes 20-25) We demonstrate the various bells and whistles of Hydra and hydra-zen. These include:

Generating rich configurations for your code "on the fly".
Checking your configurations using runtime validation.
Leveraging hydra_zen.typing for enriched, static context about your project's configurations.
Auto-generating a command-line interface from your application's configuration.
Launching multiple runs of your application (in parallel!).
Performing sweeps over configurations (e.g. hyperparameter searches).
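The last two items can be pictured with a small standard-library sketch. This is a conceptual illustration only and not how Hydra implements it; Hydra exposes sweeps through its multirun mode. The functions `run_experiment` and `sweep`, and the score they produce, are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def run_experiment(lr: float, batch_size: int) -> float:
    """Hypothetical stand-in for one run; returns a made-up score."""
    return round(lr * batch_size, 6)


def sweep(grid: dict) -> list:
    # Expand the grid into one configuration per combination of values.
    keys = list(grid)
    configs = [dict(zip(keys, values)) for values in product(*grid.values())]
    # Launch the runs concurrently; `map` returns results in input order.
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(lambda cfg: run_experiment(**cfg), configs))
    return list(zip(configs, scores))


results = sweep({"lr": [0.1, 0.01], "batch_size": [16, 32]})
for config, score in results:
    print(config, score)
```

Each result stays paired with the configuration that produced it, which is the bookkeeping that a framework is meant to handle for you.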

(minutes 25-30) Q&A