Sunday 11:00–11:45 in LG6

Probabilistic Programming in Python with PyMC3

Thomas Wiecki

Audience level:
Intermediate

Description

Probabilistic programming is a new paradigm that greatly increases the number of people who can successfully build statistical models and machine learning algorithms, and makes experts radically more effective. This talk will provide an overview of PyMC3, a new probabilistic programming package for Python featuring intuitive syntax and next-generation sampling algorithms.

Abstract

Machine learning is the driving force behind many recent revolutions in data science. Comprehensive libraries provide the data scientist with many turnkey algorithms that have very weak assumptions on the actual distribution of the data being modeled. While this blackbox property makes machine learning algorithms applicable to a wide range of problems, it also limits the amount of insight that can be gained by applying them.

The field of statistics on the other hand often approaches problems individually and hand-tailors statistical models to specific problems. To perform inference on these models, however, is often mathematically very challenging, and thus requires time-deriving equations as well as simplifying assumptions (like the normality assumption) to make inference mathematically tractable.

Probabilistic programming is a new programming paradigm that provides the best of both worlds and revolutionizes the field of machine learning. Recent methodological advances in sampling algorithms like Markov Chain Monte Carlo (MCMC), as well as huge increases in processing power, allow for almost complete automation of the inference process. Probabilistic programming thus greatly increases the number of people who can successfully build statistical models and machine learning algorithms, and makes experts radically more effective. Data scientists can create complex generative Bayesian models tailored to the structure of the data and specific problem at hand, but without the burden of mathematical tractability or limitations due to mathematical simplifications.

This talk will provide an overview of PyMC3, a new probabilistic programming package for Python featuring intuitive syntax and next-generation sampling algorithms.