Wednesday Oct. 7, 2020, 10 a.m.–Oct. 7, 2020, 11 a.m. in Online

From notebook to production using flask and heroku

Koen Vossen

Audience level:
Novice

Description

Learn, from a developer's perspective, how to go from a notebook to production. This talk gives an overview of the steps required to turn code in a notebook into an application running in production.

Abstract

Joe Mulberry is a data scientist at FC Nordsjælland, a professional Danish football team. He created a cool algorithm to find similar situations in football matches based on tracking data. The algorithm worked fine on his laptop, but it had some issues. First, it was very slow, and it sometimes crashed because of high memory usage. Second, it required a lot of hardcoded configuration. And last, it only ran on his laptop.

Together we looked at the notebook code and refactored it into more common layers: an API layer, an infrastructure layer, a service layer and the domain models. This gave us the ability to optimize each part in isolation. Steps we will look into:

  1. Replace pandas with plain Python objects for better readability
  2. Replace KDE with NN search, using the compiled munkres package, to allow real-time querying
  3. Use the infra/service layer for easy command-line testing
  4. Replace local reading of files with an S3 loader
  5. Build a simple API with Flask on top of the infra/service layer
  6. Deploy to a Heroku free dyno
  7. Add a React application on top of it to visualise the results
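
To make the layering behind these steps concrete, here is a minimal sketch using only the Python standard library. It is not the code from the talk: the Situation fields, the InMemoryRepository stand-in, the closest_in_time ranking and the handle_query function are all hypothetical names chosen for illustration, and the ranking is a placeholder for the real similarity search, which the abstract does not show.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Iterable, List, Protocol


@dataclass(frozen=True)
class Situation:
    """Domain model: a plain Python object (hypothetical fields)."""
    situation_id: str
    match_id: str
    timestamp: float  # seconds since kick-off (assumed unit)


class SituationRepository(Protocol):
    """Infrastructure contract: where situations are loaded from."""

    def load(self, match_id: str) -> List[Situation]: ...


class InMemoryRepository:
    """Stand-in loader; a file or S3 loader could expose the same method."""

    def __init__(self, situations: Iterable[Situation]) -> None:
        self._situations = list(situations)

    def load(self, match_id: str) -> List[Situation]:
        return [s for s in self._situations if s.match_id == match_id]


class SituationService:
    """Service layer: combines loading and searching."""

    def __init__(self, repository: SituationRepository) -> None:
        self._repository = repository

    def closest_in_time(
        self, match_id: str, timestamp: float, limit: int = 3
    ) -> List[Situation]:
        # Placeholder ranking by time distance, not the real similarity search.
        candidates = self._repository.load(match_id)
        return sorted(candidates, key=lambda s: abs(s.timestamp - timestamp))[:limit]


def handle_query(service: SituationService, params: Dict[str, str]) -> Dict[str, list]:
    """API layer: map request parameters to a JSON-serialisable dict."""
    found = service.closest_in_time(params["match_id"], float(params["timestamp"]))
    return {"situations": [asdict(s) for s in found]}


if __name__ == "__main__":
    # Command-line check of the service without any web framework.
    repo = InMemoryRepository(
        [
            Situation("a", "m1", 10.0),
            Situation("b", "m1", 50.0),
            Situation("c", "m2", 12.0),
        ]
    )
    response = handle_query(SituationService(repo), {"match_id": "m1", "timestamp": "12"})
    print(json.dumps(response, indent=2))
```

Because the service only depends on the load method, the in-memory loader can be swapped for a file or S3 loader without touching the search code, and a Flask view would only need to pass its request parameters to handle_query and return the resulting dict as JSON.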

When the application was done, we were able to easily extract the loading/parsing parts of the application and build an open-source library out of them: kloppy. The library allows everyone to skip the entire loading/parsing part of working with football data and start right away with algorithms.

During this talk we will show all steps we went through during this process.
