Sunday 15:00–15:45 in Tower Suite 3

How good is your prediction? - Quantifying uncertainty in Machine Learning predictions

Maria Navarro

Audience level:
Intermediate

Description

It is common practice to test the performance of ML models, but it is much less common to test the reliability of their predictions. Training a model, testing its performance, and hoping that it will produce good-quality predictions is not the right approach if we are concerned with reliable ML. Hence, in this talk, we will discuss conformal prediction, a framework for quantifying the quality of predictions.

Abstract

One of the disadvantages of ML, specifically when performing a regression task, is the lack of reasonable confidence intervals on a given prediction. For example, if we want to estimate the risk of cancer recurrence in a patient, it is crucial to know how close the prediction of the model is to the true value. In other words, we need to know how good the prediction is. There are various ad hoc ways to measure this, such as using cross-validation to produce an average confidence interval, or generating confidence intervals using resampling methods to give a distribution of predictions. However, these methods require assumptions, either about the algorithm or about the data. Conformal prediction assumes very little about the outcome you are trying to forecast or the algorithm you are using to predict, but still produces reliable confidence intervals. The aim of this talk is therefore to give a gentle introduction to conformal prediction.
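To make the regression case concrete, here is a minimal sketch of split (inductive) conformal prediction using only NumPy. The data, the least-squares model, and the 90% coverage level are invented for illustration; the talk itself may use different models and examples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data: y = 2x + Gaussian noise
x = rng.uniform(0, 10, 500)
y = 2.0 * x + rng.normal(0, 1.0, 500)

# Split the data: a training set to fit the model, a calibration set for the scores
x_tr, y_tr = x[:300], y[:300]
x_cal, y_cal = x[300:], y[300:]

# Fit any point predictor -- here a simple least-squares line
slope, intercept = np.polyfit(x_tr, y_tr, 1)

def predict(x_new):
    return slope * x_new + intercept

# Nonconformity score: absolute residual on the calibration set
scores = np.abs(y_cal - predict(x_cal))

# For coverage 1 - alpha, take the ceil((n+1)(1-alpha))-th smallest score
alpha = 0.1
n = len(scores)
q = np.sort(scores)[int(np.ceil((n + 1) * (1 - alpha))) - 1]

# The conformal interval for a new point is simply [yhat - q, yhat + q]
x_new = 5.0
yhat = predict(x_new)
lower, upper = yhat - q, yhat + q
```

Note that nothing in the interval construction depends on the choice of predictor: replacing `np.polyfit` with any other model leaves the calibration step unchanged, which is exactly the algorithm-independence the talk emphasizes.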

The conformal prediction framework is a recent development in ML [1] which associates a reliable measure of confidence with a prediction, suitable for any type of ML algorithm. In this talk, I will formalize the concepts of conformal prediction and the nonconformity measure, and we will see that the method is independent of the algorithm, requiring no additional assumptions. Then, I will show how to compute conformal intervals for both classification and regression problems. Moreover, I will discuss different nonconformity measures (see [2-3]), and show examples of how this choice impacts a problem. Finally, I will show, using Python, how conformal prediction can be applied to a real problem.
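For classification, one commonly used nonconformity measure is one minus the predicted probability of the candidate label; the prediction set then contains every label whose conformal p-value exceeds the significance level. The sketch below illustrates this with a toy two-class problem and a stand-in logistic classifier whose parameters are assumed known; all names and numbers are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 2-class data: the label follows the sign of a 1-D feature, with some overlap
x_cal = rng.normal(0, 1, 200)
y_cal = (x_cal + rng.normal(0, 0.5, 200) > 0).astype(int)

# Stand-in probabilistic classifier (a fixed logistic curve)
def prob_class1(x):
    return 1.0 / (1.0 + np.exp(-4.0 * x))

# Nonconformity score: 1 - estimated probability of the true label
p1 = prob_class1(x_cal)
cal_scores = np.where(y_cal == 1, 1.0 - p1, p1)

def prediction_set(x_new, alpha=0.1):
    """Return every label whose conformal p-value exceeds alpha."""
    labels = []
    p_new = prob_class1(x_new)
    for label in (0, 1):
        score = 1.0 - p_new if label == 1 else p_new
        # p-value: fraction of calibration scores at least as nonconforming
        pval = (np.sum(cal_scores >= score) + 1) / (len(cal_scores) + 1)
        if pval > alpha:
            labels.append(label)
    return labels
```

Far from the decision boundary the set contains a single confident label, while near the boundary it can contain both labels (or, at strict significance levels, be empty), which is how conformal classifiers express uncertainty.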

References:

[1] G. Shafer and V. Vovk, A tutorial on conformal prediction, Journal of Machine Learning Research, 9, 371-421, 2008.

[2] H. Boström, H. Linusson, T. Löfström and U. Johansson, Accelerating difficulty estimation for conformal regression forests, Ann Math Artif Intell, 81, 125–144, 2017.

[3] V. Balasubramanian, S.-S. Ho and V. Vovk, Conformal Prediction for Reliable Machine Learning: Theory, Adaptations and Applications, Morgan Kaufmann, 2014.
