Sunday 10:00–10:35 in Megatorium

Improving Machine Learning Workflow - Training, Packaging and Serving your Models

Wilder Rodrigues

Audience level:
Intermediate

Description

As machine learning practitioners, we know how hard it can be to have a smooth process around training and serving production-ready models. Processing the data, saving all the relevant artefacts to make experiments reproducible, packaging and serving the models: all these individual components can be a nightmare to implement and manage. MLflow, a new platform for managing the ML life cycle, comes to the rescue.

Abstract

As machine learning practitioners, we know how hard it can be to have a smooth process around training and serving production-ready models. Processing the data, saving all the relevant artefacts to make experiments reproducible, packaging and serving the models; all these individual components can be a nightmare to implement and manage. MLflow - an amazing new platform for managing the ML life cycle - comes to the rescue.

In this talk, we will present a Docker-powered infrastructure that combines MLflow, JupyterHub and Minio (S3-compliant storage) to address the problems above and improve your machine learning workflow. In addition, we will present a CI/CD pipeline that fetches production-ready models from storage, then builds and publishes Docker images that serve these models in production. With this in place, tasks like experimenting, releasing and serving models become more straightforward and less manual. We will explore how this infrastructure can speed up our work, make it less error prone, and help us manage all ML-related artefacts better.
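To make the setup more concrete, the sketch below shows how training code might talk to such a stack: an MLflow tracking server for runs and a Minio bucket (S3-compliant) as the artifact store. The endpoint URLs, credentials and experiment name are illustrative placeholders, not the actual configuration presented in the talk.

```python
import os
import mlflow

# Point MLflow's S3 client at the Minio endpoint instead of AWS S3.
# All values below are hypothetical placeholders.
os.environ["MLFLOW_S3_ENDPOINT_URL"] = "http://minio:9000"
os.environ["AWS_ACCESS_KEY_ID"] = "minio-access-key"
os.environ["AWS_SECRET_ACCESS_KEY"] = "minio-secret-key"

# The tracking server records params/metrics and keeps artefacts in Minio.
mlflow.set_tracking_uri("http://mlflow:5000")
mlflow.set_experiment("demo-experiment")

with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)
    mlflow.log_metric("val_accuracy", 0.93)

    # Any file logged here ends up in the bucket configured as the
    # tracking server's artifact store, keeping the run reproducible.
    with open("model_summary.txt", "w") as f:
        f.write("toy summary\n")
    mlflow.log_artifact("model_summary.txt")
```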

We will start the talk by presenting the infrastructure and its components and how they address practitioners’ pain points. Next, we will show how our solution helps to train models in a structured way. Lastly, we will demonstrate how to automate packaging and serving of the models prior to deployment.
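As a rough sketch of that train-then-package flow, the snippet below logs a toy scikit-learn model with MLflow and then loads it back by run id; this is the same mechanism a CI/CD job could use before wrapping a production-ready model in a serving image, for example with the `mlflow models build-docker` command. The dataset and model here are stand-ins; only the MLflow calls illustrate the workflow.

```python
import mlflow
import mlflow.pyfunc
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

with mlflow.start_run() as run:
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_metric("train_accuracy", model.score(X, y))

    # The serialised model and its environment specification are stored
    # as run artefacts, which is what makes later packaging reproducible.
    mlflow.sklearn.log_model(model, "model")

# A CI/CD job can later fetch exactly this model by run id and serve it,
# e.g. via `mlflow models serve` or by building a Docker image around it.
model_uri = f"runs:/{run.info.run_id}/model"
loaded = mlflow.pyfunc.load_model(model_uri)
print(loaded.predict(X[:3]))
```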
