Ever wondered why your models are not winning any Kaggle competitions? Randomly guessing model parameters might work for some, but not everybody gets that lucky! In this talk, we'll look at a better way of optimising machine learning models than random search. We'll build up some intuition about how Bayesian optimisation with Gaussian processes works, and how we can implement it using scikit-learn.
Choosing the right parameters for a machine learning model is almost more of an art than a science. Proper model selection plays a huge part in the performance of a machine learning model. It is remarkable, then, that the industry-standard algorithm for selecting model parameters is something as simple as random search.
Random search works well when we can sample the validation loss cheaply. In some settings, however, it can take hours, or even days, to get a single sample of the validation loss. In those cases, it feels very wasteful not to let the samples we have already collected guide the search.
Bayesian optimisation uses Gaussian processes to model the loss function. Following the Bayesian paradigm, we update this model each time we compute the validation loss for a new set of parameters, and then use the updated model to decide which parameters to try next. I will talk you through the inner workings of the algorithm, so that we can better understand how it explores the parameter space.
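To make that loop concrete, here is a minimal sketch on a one-dimensional toy problem. Several things in it are assumptions made for illustration rather than details from the talk: `toy_loss` is a hypothetical stand-in for an expensive validation loss, the Matern kernel and the expected-improvement acquisition function are just one common choice, and the next point is picked from a fixed grid of candidates instead of by a proper optimiser.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def toy_loss(x):
    # Hypothetical stand-in for an expensive validation loss.
    return np.sin(3 * x) + 0.5 * x ** 2


def expected_improvement(candidates, gp, best_loss, xi=0.01):
    # Expected amount by which each candidate would beat the best loss so far,
    # under the Gaussian process posterior (we are minimising).
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)  # avoid division by zero at sampled points
    improvement = best_loss - mu - xi
    z = improvement / sigma
    return improvement * norm.cdf(z) + sigma * norm.pdf(z)


rng = np.random.default_rng(0)
bounds = (-2.0, 2.0)

# Start from a few random samples, exactly as random search would.
X = rng.uniform(*bounds, size=(3, 1))
y = toy_loss(X).ravel()

# Assumption: a dense grid is good enough to maximise the acquisition in 1D.
candidates = np.linspace(*bounds, 400).reshape(-1, 1)

for _ in range(12):
    # Refit the Gaussian process on every sample collected so far.
    gp = GaussianProcessRegressor(
        kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True, random_state=0
    )
    gp.fit(X, y)

    # Evaluate the loss where the model expects the largest improvement.
    ei = expected_improvement(candidates, gp, y.min())
    x_next = candidates[np.argmax(ei)].reshape(1, -1)
    X = np.vstack([X, x_next])
    y = np.append(y, toy_loss(x_next).ravel())

best_x = X[np.argmin(y), 0]
best_y = y.min()
print(f"best x = {best_x:.3f}, loss = {best_y:.3f}")
```

The `xi` parameter nudges the search towards exploration: larger values make the algorithm more willing to sample regions where the model is uncertain rather than refining the current best point.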
But how do we use this algorithm in our day-to-day work? I'll briefly show you how you can implement this algorithm on top of scikit-learn, and how you can apply it to tune a machine learning model. I'll also discuss some common gotchas and other things to watch out for.
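As a rough idea of what this could look like in practice, the sketch below tunes two parameters of a scikit-learn `SVC` on the iris dataset. The dataset, the model, the search ranges, and the helper name `validation_loss` are all assumptions chosen to keep the example small; they are not taken from the talk. It also bakes in two precautions that are often suggested for this kind of search: working on a log scale for `C` and `gamma`, and giving the Gaussian process a small noise term (`alpha`) because cross-validated losses are not perfectly smooth.

```python
import numpy as np
from scipy.stats import norm
from sklearn.datasets import load_iris
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X_data, y_data = load_iris(return_X_y=True)


def validation_loss(params):
    # params holds (log10 C, log10 gamma); the loss is 1 - mean CV accuracy.
    log_c, log_gamma = params
    model = SVC(C=10.0 ** log_c, gamma=10.0 ** log_gamma)
    return 1.0 - cross_val_score(model, X_data, y_data, cv=5).mean()


rng = np.random.default_rng(0)
# Assumed search box in log10 space: C in [1e-2, 1e3], gamma in [1e-4, 1e1].
low, high = np.array([-2.0, -4.0]), np.array([3.0, 1.0])

# A handful of random evaluations to initialise the model.
params = rng.uniform(low, high, size=(5, 2))
losses = np.array([validation_loss(p) for p in params])

for _ in range(15):
    gp = GaussianProcessRegressor(
        kernel=Matern(nu=2.5), alpha=1e-4, normalize_y=True, random_state=0
    )
    gp.fit(params, losses)

    # Score random candidates with expected improvement; evaluate the best one.
    candidates = rng.uniform(low, high, size=(500, 2))
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    gain = losses.min() - mu
    ei = gain * norm.cdf(gain / sigma) + sigma * norm.pdf(gain / sigma)
    next_params = candidates[np.argmax(ei)]

    params = np.vstack([params, next_params])
    losses = np.append(losses, validation_loss(next_params))

best = params[np.argmin(losses)]
best_C, best_gamma = 10.0 ** best
print(f"C = {best_C:.3g}, gamma = {best_gamma:.3g}, loss = {losses.min():.3f}")
```

Each call to `validation_loss` here takes a fraction of a second, so random search would do just as well on this problem; the approach only pays off when a single evaluation is expensive enough that refitting the Gaussian process is cheap by comparison.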