Saturday 15:45–16:20 in Megatorium

Online machine learning with creme

Max Halford

Audience level:
Intermediate

Description

Machine learning is often presented as a batch problem, whereas in practice data usually arrives sequentially. Online machine learning is an elegant approach where a model can be trained on a stream of data. This is ideal for deploying machine learning pipelines. In this talk we'll see how to implement an online machine learning system with a new open source library, creme.

Abstract

Machine learning is often presented as a batch problem, where typically you "fit" a model to a training set and then "predict" on a test set. However, this approach turns out to be too rigid in practice, mostly because real-life data usually arrives sequentially. Online machine learning is a different approach where the model can be trained on a stream of data, one observation at a time. This has many benefits and only a few downsides in comparison with batch machine learning.
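To make the contrast with "fit" then "predict" concrete, here is a minimal sketch of the online approach in plain Python. It is not creme's API: the class `OnlineLinearRegression`, the methods `learn_one` and `predict_one`, and the synthetic `stream` generator are hypothetical names chosen for illustration. The model is a linear regression updated by one stochastic gradient step per observation, and each prediction is made before the true label is revealed.

```python
class OnlineLinearRegression:
    """Hypothetical toy model for illustration; not part of creme."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = {}  # one weight per feature name, created lazily
        self.bias = 0.0

    def predict_one(self, x):
        # x is a dict mapping feature names to numeric values.
        return self.bias + sum(self.weights.get(k, 0.0) * v for k, v in x.items())

    def learn_one(self, x, y):
        # One gradient step on the squared error of a single observation.
        error = self.predict_one(x) - y
        for k, v in x.items():
            self.weights[k] = self.weights.get(k, 0.0) - self.learning_rate * error * v
        self.bias -= self.learning_rate * error
        return self


def stream(n=2000):
    # Synthetic, noiseless stream (an assumption for the demo): y = 2 * x + 1.
    for i in range(n):
        x = (i % 10) / 10
        yield {"x": x}, 2 * x + 1


model = OnlineLinearRegression(learning_rate=0.1)
abs_errors = []

for x, y in stream():
    y_pred = model.predict_one(x)       # predict before seeing the label
    abs_errors.append(abs(y - y_pred))  # score the prediction
    model.learn_one(x, y)               # then update the model with the label

early_error = sum(abs_errors[:100]) / 100
late_error = sum(abs_errors[-100:]) / 100
```

The model never holds the whole dataset in memory, and the running error doubles as an out-of-sample evaluation, since every observation is scored before it is used for training.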

In this talk, we'll go over the differences between batch and online machine learning. We'll go through the pros and cons and see how online learning is, in my opinion, a far better approach for putting machine learning pipelines into production. The presentation will contain a mix of concepts and practical examples. We will see how to implement an online machine learning system with an up-and-coming open source library called creme.

The audience will, for the most part, leave the talk with some fresh ideas that they can easily start applying in practice. The talk should be very valuable for practicing data scientists and data engineers. Moreover, if you're into Bayesian statistics, there will be a nice surprise at the end of the presentation!
