Thursday, October 28, 9:00 AM – 9:30 AM in Talks II

Is my data drifting? Early monitoring for machine learning models in production.

Emeli Dral

Prior knowledge:
No previous knowledge expected

Summary

Machine learning models can degrade over time, often due to changes in the input data or in real-world patterns. It is critical to monitor model performance in production. However, it is not always possible to evaluate model quality when ground truth labels are unavailable. In this talk, we will show how one can monitor data and prediction drift as a proxy for performance decay.

Description

Machine learning models can degrade over time. Often, this is due to changes in the input data and/or in the relationship between the features and the target. It is important to keep an eye on model relevance and intervene in time if something goes wrong.

But it is not always possible to evaluate model quality in production directly, since the ground truth labels or actual values are not always available. In this case, detecting a change in the distributions of the input data and the model predictions can serve as an early warning of expected model decay.
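
To make the idea concrete, here is a minimal sketch of one common approach (an illustration, not necessarily the specific method covered in the talk): compare a feature's reference distribution, e.g. from training data, against a recent production window using a two-sample Kolmogorov-Smirnov test from scipy. The synthetic data, window sizes, and the 0.05 threshold are all assumptions made for this example.

import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
# Illustrative data: reference window (e.g. training-time feature values)
# and a recent production window with a slight shift in the mean.
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)
current = rng.normal(loc=0.3, scale=1.0, size=5_000)

# Two-sample KS test: a small p-value suggests the two samples
# come from different distributions.
result = ks_2samp(reference, current)

# 0.05 is a conventional threshold, not a universal rule.
if result.pvalue < 0.05:
    print(f"Possible drift: KS statistic={result.statistic:.3f}, p-value={result.pvalue:.4f}")
else:
    print(f"No drift detected: KS statistic={result.statistic:.3f}, p-value={result.pvalue:.4f}")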

In this talk, we will explore how one can evaluate data drift using statistical tests, visualize it, and interpret the results.
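
As a simple preview of the visualization step (a sketch under assumed synthetic data, not material from the talk itself), one option is to overlay normalized histograms of the reference and current distributions of a feature, which makes a shift in location or shape visible at a glance.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
# Same illustrative setup as above: a slight mean shift between windows.
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)
current = rng.normal(loc=0.3, scale=1.0, size=5_000)

# Overlaid, density-normalized histograms complement the statistical test.
plt.hist(reference, bins=50, density=True, alpha=0.5, label="reference")
plt.hist(current, bins=50, density=True, alpha=0.5, label="current")
plt.xlabel("feature value")
plt.ylabel("density")
plt.title("Reference vs. current feature distribution")
plt.legend()
plt.show()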