Saturday 17:00–17:45 in LG7

To explain or to predict?

Nick Sorros

Audience level:
Intermediate

Description

Data science, and machine learning in particular, tends to favour statistical modelling that optimises for prediction rather than explanation. Understanding which variables correlate with the phenomenon you are investigating is an inherent part of both approaches, but models with high explanatory power are not necessarily the ones that yield the best predictions.

Abstract

The goal of statistical modelling is to establish a relationship between observational data and the phenomenon under investigation. Predictive and explanatory modelling are sub-branches of statistical modelling that optimise for prediction and explanation respectively.

The data science and machine learning literature is filled with examples of predictive modelling approaches, while other disciplines, such as economics, rely almost solely on explanatory methods.

In theory, a model that explains your dataset well should also predict well, but in practice it has been shown that this is not necessarily the case. As a result, the two approaches call for different optimisations and tradeoffs during modelling.
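As a rough illustration of that gap, the sketch below takes one narrow reading of "explains your dataset well", namely in-sample fit, which is an assumption and not necessarily the definition used in the talk. The data are made up: eight points on the line y = 2x with an alternating +1/-1 disturbance, and held-out points placed exactly on that line. The helper names (`fit_line`, `fit_interpolant`, `mse`) are hypothetical and written only for this example.

```python
# Toy illustration with made-up data: in-sample fit versus held-out error.
# Standard library only; all names here are hypothetical helpers.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns a prediction function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    """Lagrange polynomial that passes through every training point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict


def mse(model, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Training data: y = 2x plus an alternating +1 / -1 disturbance (synthetic).
train_x = [float(i) for i in range(8)]
train_y = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(train_x)]

# Held-out data: midpoints between training points, exactly on y = 2x.
test_x = [i + 0.5 for i in range(7)]
test_y = [2 * x for x in test_x]

line = fit_line(train_x, train_y)
interpolant = fit_interpolant(train_x, train_y)

for name, model in [("straight line", line), ("interpolant", interpolant)]:
    print(name,
          "| in-sample MSE:", round(mse(model, train_x, train_y), 3),
          "| held-out MSE:", round(mse(model, test_x, test_y), 3))
```

The interpolating polynomial reproduces the training data perfectly, yet its error on the held-out points is far larger than that of the plain straight line, because it also fits the disturbance. This is only the overfitting side of the story; explanatory modelling in the broader sense involves more than in-sample fit.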

The goal of this talk is to discuss these tradeoffs, showcase the similarities and differences between the two approaches, and discuss when it is best to use one over the other.