pomegranate: fast and flexible probabilistic modeling in python

Audience level:
Intermediate

Description

This tutorial introduces pomegranate, a Python package for flexible probabilistic modeling. I will begin with an overview of its core models and features, along with a comparison to other packages. We will then work through a hands-on demonstration of using Bayes classifiers and mixture models in ways that no other package allows, including stacking the two to create a Gaussian mixture Bayes classifier.

Abstract

In this tutorial I will give a high-level overview of pomegranate, a Python package for flexible probabilistic modeling. I will first introduce the general models that are supported, including simple probability distributions, Bayes classifiers, mixture models, hidden Markov models, and Bayesian networks. I will then delve into the design decisions behind pomegranate and how they enable a great deal more flexibility than other packages at no additional computational cost. Lastly, we will have a hands-on tutorial in which we see how to use pomegranate to easily create complicated yet intuitive models that no other package allows you to create.

One of the core tenets of pomegranate is that even seemingly complicated models, such as Bayesian networks, can still be treated as simple probability distributions, and that the mathematics behind a probabilistic model can frequently be decoupled from the specific probability distribution used. This has very real-world consequences. For example, when analyzing signal data, one may use a normal distribution to model the mean signal but find that the duration of the signal is exponentially distributed. Using the appropriate distributions, whether in a simple naive Bayes classifier or a more complicated multivariate hidden Markov model, can frequently create more accurate models without changing the data that the model is fit to.
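
To make the decoupling concrete, here is a minimal sketch in plain Python. It is not pomegranate's API: the class names `Normal`, `Exponential`, and `NaiveBayes`, and the toy data, are hypothetical and exist only for illustration. The classifier only asks each distribution for `fit` and `log_prob`, so the mean signal can be modeled as normal and the duration as exponential without touching the classification logic.

```python
import math


class Normal:
    """Univariate normal distribution fit by maximum likelihood."""

    def fit(self, xs):
        self.mu = sum(xs) / len(xs)
        self.sigma = math.sqrt(sum((x - self.mu) ** 2 for x in xs) / len(xs))
        return self

    def log_prob(self, x):
        z = (x - self.mu) / self.sigma
        return -0.5 * z * z - math.log(self.sigma) - 0.5 * math.log(2 * math.pi)


class Exponential:
    """Exponential distribution (x >= 0) fit by maximum likelihood."""

    def fit(self, xs):
        self.rate = len(xs) / sum(xs)
        return self

    def log_prob(self, x):
        return math.log(self.rate) - self.rate * x


class NaiveBayes:
    """Naive Bayes where each feature has its own distribution type."""

    def __init__(self, feature_types):
        self.feature_types = feature_types

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.log_priors = {}
        self.dists = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_priors[c] = math.log(len(rows) / len(X))
            # One independent distribution per feature, per class.
            self.dists[c] = [
                ftype().fit([row[j] for row in rows])
                for j, ftype in enumerate(self.feature_types)
            ]
        return self

    def predict(self, x):
        def score(c):
            # Log prior plus the sum of per-feature log likelihoods.
            return self.log_priors[c] + sum(
                d.log_prob(v) for d, v in zip(self.dists[c], x)
            )

        return max(self.classes, key=score)


# Hypothetical toy data: (mean signal, duration) pairs for two classes.
X = [(1.0, 0.2), (1.2, 0.5), (0.8, 0.1), (1.1, 0.9),
     (3.0, 2.0), (2.8, 5.5), (3.3, 1.0), (3.1, 3.5)]
y = ["short"] * 4 + ["long"] * 4

model = NaiveBayes([Normal, Exponential]).fit(X, y)
print(model.predict((1.0, 0.3)), model.predict((3.0, 4.0)))
```

Swapping `Exponential` for another distribution would only require a class with the same two methods; the classifier itself would not change.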

Another core design decision is that all models are fit using additive summary statistics. Essentially, a batch of data can be reduced to a small set of numbers from which exact updates can be derived. These numbers can be added together across batches to calculate exact updates as if they were calculated on the full data set. This naturally enables a great many features, such as out-of-core learning for massive data sets that don't fit in memory, a strategy for multi-threaded fitting, mini-batch updates, and even semi-supervised learning. These features, and more, will be described in the talk, with an emphasis on how little work the user needs to do in order to specify complicated training strategies.
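
As an illustration of additive summary statistics, the sketch below fits a normal distribution from batches in plain Python. It does not use pomegranate; the helper names `summarize`, `combine`, and `estimate` are hypothetical. Each batch is reduced to a count, a sum, and a sum of squares, and adding these across batches gives the same maximum likelihood estimates as a single pass over the full data set, up to floating-point rounding.

```python
import math
import random


def summarize(batch):
    """Reduce a batch to additive statistics: (count, sum, sum of squares)."""
    return (len(batch), sum(batch), sum(x * x for x in batch))


def combine(a, b):
    """Add two sets of summary statistics element-wise."""
    return tuple(x + y for x, y in zip(a, b))


def estimate(stats):
    """Maximum likelihood mean and variance from the summary statistics."""
    n, s, ss = stats
    mean = s / n
    var = ss / n - mean * mean
    return mean, var


random.seed(0)
data = [random.gauss(5.0, 2.0) for _ in range(10_000)]

# Estimate from the full data set at once.
full_mean, full_var = estimate(summarize(data))

# Estimate from batches of 1,000, as in out-of-core or mini-batch fitting.
total = (0, 0.0, 0.0)
for start in range(0, len(data), 1000):
    total = combine(total, summarize(data[start:start + 1000]))
batch_mean, batch_var = estimate(total)

print(math.isclose(full_mean, batch_mean, rel_tol=1e-9),
      math.isclose(full_var, batch_var, rel_tol=1e-9))
```

Because the batches are summarized independently, the same pattern would let separate threads or processes each summarize a chunk and add the results at the end.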

Next, we will go into a hands-on demonstration of using Bayes classifiers and mixture models in ways that other packages do not support. This includes modeling each feature as a separate distribution, and stacking simpler models to create more complicated ones. In particular, we will create a Gaussian mixture Bayes classifier, which is similar to the widely known Gaussian naive Bayes model except that it uses mixture models instead of individual normal distributions to model more complicated data.
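
The stacking idea can be sketched in plain Python as follows. This is not pomegranate's implementation: `GaussianMixture1D` and `MixtureBayesClassifier` are hypothetical names, the mixture is one-dimensional, and the synthetic data is made up for illustration. The Bayes classifier treats each class's mixture as just another distribution with a `log_prob` method, which lets it separate a bimodal class from a unimodal class that sits between its two modes.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


class GaussianMixture1D:
    """One-dimensional Gaussian mixture fit with a fixed number of EM steps."""

    def __init__(self, n_components=2, n_iter=50):
        self.k = n_components
        self.n_iter = n_iter

    def _component_logs(self, x):
        return [math.log(w) + normal_logpdf(x, m, s)
                for w, m, s in zip(self.weights, self.mus, self.sigmas)]

    def fit(self, xs):
        xs = sorted(xs)
        n = len(xs)
        # Initialize the means at evenly spaced quantiles of the data.
        self.mus = [xs[(2 * j + 1) * n // (2 * self.k)] for j in range(self.k)]
        spread = (xs[-1] - xs[0]) / self.k or 1.0
        self.sigmas = [spread] * self.k
        self.weights = [1.0 / self.k] * self.k
        for _ in range(self.n_iter):
            # E-step: responsibility of each component for each point.
            resp = []
            for x in xs:
                logs = self._component_logs(x)
                norm = logsumexp(logs)
                resp.append([math.exp(v - norm) for v in logs])
            # M-step: responsibility-weighted maximum likelihood updates.
            for j in range(self.k):
                rj = sum(r[j] for r in resp) + 1e-12  # guard against empty components
                self.weights[j] = rj / n
                self.mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / rj
                var = sum(r[j] * (x - self.mus[j]) ** 2
                          for r, x in zip(resp, xs)) / rj
                self.sigmas[j] = max(math.sqrt(var), 1e-3)  # variance floor
        return self

    def log_prob(self, x):
        return logsumexp(self._component_logs(x))


class MixtureBayesClassifier:
    """Bayes classifier whose class-conditional densities are mixtures."""

    def fit(self, data_by_class):
        total = sum(len(xs) for xs in data_by_class.values())
        self.models = {c: GaussianMixture1D().fit(xs)
                       for c, xs in data_by_class.items()}
        self.log_priors = {c: math.log(len(xs) / total)
                           for c, xs in data_by_class.items()}
        return self

    def predict(self, x):
        return max(self.models,
                   key=lambda c: self.log_priors[c] + self.models[c].log_prob(x))


# Synthetic data: class "A" is bimodal, class "B" lies between its two modes.
random.seed(0)
class_a = ([random.gauss(-4.0, 0.5) for _ in range(100)]
           + [random.gauss(4.0, 0.5) for _ in range(100)])
class_b = [random.gauss(0.0, 0.5) for _ in range(200)]

clf = MixtureBayesClassifier().fit({"A": class_a, "B": class_b})
print([clf.predict(x) for x in (-4.0, 0.0, 4.0)])
```

A single normal distribution fit to class "A" would be centered near zero, directly on top of class "B", which is the kind of data where a mixture per class is the better fit.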