Those of us who use TensorFlow often focus on building the model that's most predictive, not the one that's most deployable. So how do you put that hard work to work? In this talk, we'll walk through a strategy for taking your machine learning models from Jupyter Notebook into production and beyond.
A large part of the data science workflow consists of feature engineering, model selection, hyperparameter optimization, and other highly analytical tasks. But what happens when you want to enable others to access your findings without replicating your entire workflow? While technologies like TensorFlow and Jupyter enable the creation of increasingly complex models and the sharing of results, the existing open source tooling for deploying these models in the real world is far less mature. This talk will discuss one way of leveraging some of the most loved open source frameworks to build a reliable and scalable machine learning deployment pipeline.
TensorFlow Serving, a recently released project from Google, gives practitioners the tooling necessary to package a TensorFlow model so that it can be deployed, monitored, and updated, covering the full lifecycle of a machine learning product. We will see how, by pairing this technology with an API server built with Flask, we can create a single artifact that external users can easily access while internal stakeholders retain full control. Further, we will detail how Docker integrates seamlessly with the rest of the system for added reproducibility guarantees and ease of deployment.
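To make the pattern concrete, here is a minimal sketch of the two halves of this pairing, assuming TensorFlow 2's SavedModel API and TensorFlow Serving's default gRPC port. The model name ("my_model"), the input/output tensor keys ("inputs"/"outputs"), the /predict route, and the file paths are hypothetical placeholders rather than the talk's actual code. First, packaging the model in the versioned directory layout that TensorFlow Serving watches:

```python
import tensorflow as tf

# A stand-in trained model; any tf.keras or SavedModel-compatible model works.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

# TensorFlow Serving expects versioned subdirectories: /models/<name>/<version>.
tf.saved_model.save(model, "/models/my_model/1")
```

Second, a thin Flask gateway that forwards JSON requests to the Serving process over gRPC, giving external users a single easy-to-consume endpoint:

```python
import grpc
import numpy as np
import tensorflow as tf
from flask import Flask, jsonify, request
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

app = Flask(__name__)

# Persistent channel to TensorFlow Serving's gRPC endpoint (host/port assumed).
channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON like {"instances": [[1.0, 2.0, 3.0, 4.0], ...]}.
    instances = np.asarray(request.get_json()["instances"], dtype=np.float32)

    grpc_request = predict_pb2.PredictRequest()
    grpc_request.model_spec.name = "my_model"            # placeholder model name
    grpc_request.model_spec.signature_name = "serving_default"
    grpc_request.inputs["inputs"].CopyFrom(              # input tensor key is assumed
        tf.make_tensor_proto(instances))

    result = stub.Predict(grpc_request, 5.0)             # 5-second timeout
    outputs = tf.make_ndarray(result.outputs["outputs"]) # output tensor key is assumed
    return jsonify({"predictions": outputs.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

The Serving half of this pair can run unchanged from the official tensorflow/serving Docker image (e.g. `docker run -p 8500:8500 -v /models/my_model:/models/my_model -e MODEL_NAME=my_model tensorflow/serving`), which is where the reproducibility and deployment benefits mentioned above come in.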
The talk will go into the inner workings of such a system, including its architecture and performance characteristics. The end goal is to give the audience a practical set of steps for solving some of the more painful aspects of shipping machine learning projects.