Gradient boosting decision trees (GBDT) are a powerful machine-learning technique known for their high predictive power on heterogeneous data. In this talk, we will explore scikit-learn's implementation of histogram-based GBDT, HistGradientBoostingClassifier/Regressor, and how it compares to other GBDT libraries such as XGBoost, CatBoost, and LightGBM.
Gradient boosting decision trees (GBDT) are a powerful machine-learning technique known for their high predictive power on heterogeneous data. In scikit-learn 0.21, we released our own implementation of histogram-based GBDT: HistGradientBoostingClassifier and HistGradientBoostingRegressor. This implementation is inspired by Microsoft's LightGBM and makes use of OpenMP for parallelization. In this talk, we will:
- Describe scikit-learn's histogram-based gradient boosting algorithm.
- Explore HistGradientBoostingClassifier/Regressor's hyper-parameters.
- Compare scikit-learn's implementation with other GBDT libraries such as XGBoost, CatBoost, and LightGBM.

This talk is targeted at those who are familiar with machine learning and want a deeper understanding of scikit-learn's histogram-based gradient boosting trees.
The materials for this talk can be found at github.com/thomasjpfan/pydata-2019-histgradientboosting.