Sunday 11:45–12:30 in Hörsaal 3

pyGAM: balancing interpretability and predictive power using Generalized Additive Models in Python

Dani Servén Marín

Audience level:
Intermediate

Description

With nonlinear models it is difficult to balance predictive power and interpretability. How does feature A affect the output y? How will the model extrapolate? Generalized Additive Models (GAMs) are flexible and interpretable, with mature implementations in R but few options in the Python ecosystem. pyGAM is a new open-source library that aims to fill this gap.

Abstract

When we seek a predictive model that can easily be reasoned about, we are faced with a problem. Linear models are very intuitive but can suffer from high bias, while more flexible models are typically black boxes that cannot be decomposed.

Generalized Additive Models (GAMs) address this problem by making predictions from the sum of several feature functions. Since the overall model is additive, it is easy to diagnose the contribution of each feature to the final prediction, while holding all other predictors constant.
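
The additive structure described above is commonly written as follows. The symbols are notation assumed here for illustration, not taken from the talk: g is a link function, β₀ an intercept, and f_j the feature function for predictor x_j.

```latex
g\bigl(\mathbb{E}[\,y \mid x_1, \dots, x_p\,]\bigr)
  = \beta_0 + f_1(x_1) + f_2(x_2) + \cdots + f_p(x_p)
```

Because each f_j depends on a single predictor, changing x_j while holding the others fixed shifts the right-hand side by f_j alone, which is what makes the per-feature contribution easy to read off.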

The individual feature functions are built using penalized splines, which allow us to model non-linear relationships automatically, without having to manually try out many different transformations of each variable. We can also penalize excessive wiggliness of the feature functions via a single smoothing parameter.
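
To make the idea of a penalized spline concrete, here is a minimal NumPy sketch for one feature. It is an illustration under stated assumptions, not pyGAM's implementation: it uses piecewise-linear (hat function) basis functions on equally spaced knots and a squared second-difference penalty on the coefficients, and the names `hat_basis`, `fit_penalized_spline`, and `lam` are made up for this example.

```python
import numpy as np


def hat_basis(x, knots):
    """Piecewise-linear (degree-1 B-spline) basis, shape (len(x), len(knots)).

    Assumes equally spaced knots.
    """
    h = knots[1] - knots[0]
    return np.clip(1.0 - np.abs(x[:, None] - knots[None, :]) / h, 0.0, None)


def fit_penalized_spline(x, y, knots, lam):
    """Minimize ||y - B b||^2 + lam * ||D b||^2 over coefficients b.

    D takes second differences of neighbouring coefficients, so the
    penalty grows when the fitted curve bends sharply.
    """
    B = hat_basis(x, knots)
    D = np.diff(np.eye(len(knots)), n=2, axis=0)
    coef = np.linalg.solve(B.T @ B + lam * (D.T @ D), B.T @ y)
    return coef, D


# Synthetic data: a sine wave plus noise (illustrative only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)
knots = np.linspace(0.0, 1.0, 20)

# Refit with increasing smoothing and record the wiggliness of each fit.
roughness = {}
for lam in (0.01, 1.0, 1e6):
    coef, D = fit_penalized_spline(x, y, knots, lam)
    roughness[lam] = float(np.sum((D @ coef) ** 2))
```

Larger values of `lam` shrink the second differences of the coefficients, so the fitted curve moves from a wiggly interpolation of the noise toward a straight line. No transformation of `x` has to be chosen by hand.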

The result is a flexible model in which it is easy to incorporate prior knowledge and control overfitting. However, there are few options for fitting GAMs in Python.

pyGAM is a new open-source library that aims to fill this gap. In this talk I will cover how to train GAMs with pyGAM, show how to inspect each feature's contribution to the prediction, illustrate how to automatically choose the right amount of smoothing, give some intuition for the mathematics of GAMs, and show some of the cool things you can do with GAMs.
