Friday 11:15–12:45 in GoDataDriven

How to easily set up and version your Machine Learning pipelines, using DVC and MLV-tools

Stéphanie Bracaloni, Sarah Diot-Girard

Audience level:
Intermediate

Description

Have you ever heard about Machine Learning versioning solutions? Have you ever tried one of them? And what about automation? Come with us and learn how to easily build versionable pipelines! This tutorial explains, through small exercises, how to set up a project using DVC and MLV-tools.

Abstract

You're a data scientist. You have a bunch of analyses you performed in Jupyter Notebooks, but anything older than 2 months is totally useless because it never works right when you open the notebook again. Also, you cannot remember the dropout rate on the second-to-last layer of that convolutional neural network which gave really great results 2 weeks ago and that you now want to deploy into production. Does that ring a bell?

You're a software engineer in a data science team. You can't live without Git. Reviews of readable files, tests, code analysis and CI used to be part of your daily routine. You thought of Jupyter Notebooks only as a demo tool. You need reproducibility for every step of your work, even if you lose a server. And last but not least, you want to be able to deliver to production something usable by anyone.

What

This tutorial explains and shows how to use MLV-tools to set up a development environment and to deliver the project while avoiding the frustrations caused by siloed teams or differing points of view.

There is no magical solution, but compromises can be found. MLV-tools helps to:

Requirements

Outline

Global goal: be able to easily set up your own project using MLV-tools

Attendees will be guided step by step as they experiment on their own computers.

1 - Introduction

Goal: expose versioning, automation and reproducibility issues with Machine Learning projects.

2 - What is DVC and how does it work?

Goal: understand how to handle code, hyperparameter and data versioning using Git and DVC pipelines.

3 - Handle a ML project with DVC and MLV-tools

Goal: easily use DVC on a Machine Learning project with MLV-tools.

4 - Going further

Goal: see how the process fits everyday use cases.
