Thursday 15:35–16:05 in Track 2

Can one do better than XGBoost? Presenting 2 new gradient boosting libraries - LightGBM and Catboost

Mateusz Susik

Audience level:
Intermediate

Description

We will present two recent challengers to the XGBoost library: LightGBM (released October 2016) and CatBoost (open-sourced July 2017). Participants will learn the theoretical and practical differences between these libraries. Finally, we will describe how we use gradient boosting libraries at McKinsey & Company.

Abstract

Gradient boosting has proved to be a very effective method for classification and regression in recent years. Many successful business applications and data science competition solutions have been built around the XGBoost library. It seemed that XGBoost would dominate the field for many years.

Recently, two major players have released their own implementations of the algorithm. The first - LightGBM - comes from Microsoft. Its major advantages are lower memory usage and faster training.

The second - CatBoost - was implemented by Yandex. Here the approach was different: the library aims to improve on state-of-the-art gradient boosting implementations in terms of accuracy.

During the talk, participants will learn about the differences in algorithm design, APIs, and performance.
