Bayesian learning is a principled approach to dealing with ever-present problems in machine learning with neural networks, such as overfitting and uncertainty estimation. In this talk I provide an overview of Bayesian inference and Bayesian learning with neural networks, as well as examples of Bayesian learning in practice using a Python-based probabilistic programming language, PyMC3.
In conventional formulations of supervised machine learning using deep neural network (DNN) models, the weight parameters of a DNN are trained on a collection of labeled data inputs, and new labels are predicted by propagating novel inputs through the DNN using these trained weight values. From a Bayesian perspective, the model itself is uncertain and thus the object of primary interest is a conditional distribution of network parameterizations given the training data; in this context the conventional approach produces MAP estimates of conditional statistics for data labels. This has the advantage of being potentially far less computationally complex than a fully Bayesian approach, as inference is reduced to a minimization problem over the parameters of the model. However, these networks tend to be prone to overfitting, and uncertainty estimation and regularization require the use of validation methods on top of training.
Using a Bayesian approach, label predictions and uncertainty estimates which are independent of particular model parameterizations or details of the regularization procedure are given automatically and in a principled way. While this makes Bayesian learning more intellectually sound and satisfying, it fell out of favor for some time due to its comparative complexity as a high-dimensional inference problem. However, more recent theoretical advances in variational and Monte Carlo methods for statistical inference, development of practical algorithms for approximate Bayesian inference, increasingly cheap and accessible computational power, and the development of specialized software packages and probabilistic programming languages for statistical inference, has meant Bayesian learning is enjoying a resurgence in popularity.
In this talk, I give an overview of Bayesian inference and Bayesian learning with DNNs, then discuss these advances and how they make Bayesian learning a viable approach for modern machine learning implementations. Examples of Bayesian learning in practice are shown using a Python-based probabilistic programming language, PyMC3.