Gradient boosting with decision trees (GBDT) is a powerful machine-learning technique known for its high predictive power on heterogeneous data. In this talk, we will explore scikit-learn's implementation of histogram-based GBDT, HistGradientBoostingClassifier/Regressor, and how it compares to other GBDT libraries such as XGBoost, CatBoost, and LightGBM.
Gradient boosting with decision trees (GBDT) is a powerful machine-learning technique known for its high predictive power on heterogeneous data. In scikit-learn 0.21, we released our own implementation of histogram-based GBDT: HistGradientBoostingClassifier and HistGradientBoostingRegressor. This implementation is based on Microsoft's LightGBM and makes use of OpenMP for parallelization. In this talk, we will:

- Explore scikit-learn's histogram-based gradient boosting algorithm.
- Learn about HistGradientBoostingClassifier/Regressor's hyper-parameters.
- Compare scikit-learn's implementation with other GBDT libraries such as XGBoost, CatBoost, and LightGBM.

This talk is targeted at those who are familiar with machine learning and want a deeper understanding of scikit-learn's histogram-based gradient boosting trees.
The materials for this talk can be found at github.com/thomasjpfan/pydata-2019-histgradientboosting.