Monday 10:15 AM–11:00 AM in The Franklin Suite, 3rd Floor / Technical

Serverless for Data Scientists

Mike Lee Williams 🌴

Audience level:
Intermediate

Description

Working in the cloud means you don't have to deal with hardware. The goal of "serverless" is to avoid dealing with operating systems as well. It offers instances that run only for the duration of a single function call. These instances have limitations, but much of what data scientists do is a perfect fit for this new world! We'll see how to train and deploy machine learning models using this infrastructure.

Abstract

In this talk we'll first see the basic idea behind serverless and learn how to deploy a very simple web application to AWS Lambda using Zappa. We'll then look in detail at the "embarrassingly parallel" problems where serverless really shines for data scientists. In particular we'll take a look at PyWren, an ultra-lightweight alternative to heavy big data distributed systems such as Spark. We'll learn how PyWren uses AWS Lambda as its computational backend to churn through huge analytics tasks. PyWren opens up big data to mere mortal data scientists who don't have the budget or engineering support for a long-lived cluster. We'll finish up by using PyWren and Zappa to train and deploy a production machine learning model.
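
To make the "embarrassingly parallel" pattern concrete, here is a minimal sketch of the map-style workflow described above. It is not PyWren code: it uses the standard library's ThreadPoolExecutor as a local stand-in for a serverless backend, and the names score_chunk and parallel_map are hypothetical, chosen only for illustration. The point is that each call depends only on its own input, so the calls can be farmed out independently and the results combined afterwards.

```python
from concurrent.futures import ThreadPoolExecutor


def score_chunk(chunk):
    """Stateless unit of work: the result depends only on this chunk."""
    return sum(x * x for x in chunk)


def parallel_map(func, items, max_workers=4):
    """Apply func to each item independently; results keep input order.

    Local stand-in (assumption): a serverless backend would run each call
    in its own short-lived function instance instead of a thread.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


# Split the data into independent chunks, map over them, then reduce.
chunks = [[1, 2, 3], [4, 5], [6]]
partial_sums = parallel_map(score_chunk, chunks)
total = sum(partial_sums)
```

Because no chunk needs to know about any other, the same structure should carry over when the workers are short-lived function instances rather than local threads.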
