Saturday 12:45–13:30 in Expert

Feature Importance and Ensemble Methods: a new perspective

Constant Bridon

Audience level:
Experienced

Description

Ensemble methods achieve excellent predictive performance, but they are not easy to interpret. Feature importance should not merely count how many times a feature is used across the weak learners; it should also measure how much each feature contributes to the prediction. A detailed example and implementation are provided in a Jupyter notebook, in Python, using the extreme gradient boosting library "xgboost".

Abstract

I - Feature importance in ensemble algorithms - state of the art

1) Feature importance in sklearn/xgboost: by default it simply counts the occurrences of a feature across all the weak learners (see the sketch below)

2) Construction of the trees in xgboost: if the trees are deep enough, every feature ends up being used

3) Global feature importance can be misleading: a given feature might be critical for one subpopulation but completely irrelevant for another (e.g. multi-class classification)
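As a point of reference for item 1, here is a minimal sketch on synthetic data (not the speaker's notebook): it contrasts xgboost's default occurrence-count importance ("weight") with its gain-weighted variant ("gain"). The dataset, feature names and parameter values are illustrative placeholders.

```python
import numpy as np
import xgboost as xgb

# Toy dataset: only f0 and f1 actually drive the label.
rng = np.random.RandomState(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
names = ["f0", "f1", "f2", "f3", "f4"]

dtrain = xgb.DMatrix(X, label=y, feature_names=names)
booster = xgb.train({"objective": "binary:logistic", "max_depth": 4},
                    dtrain, num_boost_round=50)

# "weight": how many times each feature is used to split, across all trees.
print(booster.get_score(importance_type="weight"))
# "gain": average loss reduction brought by the splits on each feature.
print(booster.get_score(importance_type="gain"))
```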

II - Real feature importance in xgboost

1) Prediction influence: the first splits influence the prediction more than the last ones, so the importance of a feature must be weighted by the discrimination it provides

2) Point-to-point feature importance: following the path of a given prediction through the trees, it is possible to weight the importance of every feature used along the way

3) A relevant assessment of feature importance: explanation of a given prediction, then aggregation over a set of data points (sketched below)
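For a rough idea of what this can look like in code, here is a minimal sketch on synthetic data: xgboost's predict(pred_contribs=True) returns one contribution per feature (plus a bias term) for each prediction, and those per-prediction attributions can then be aggregated over a dataset. This built-in attribution is related in spirit to the point-to-point importance described above, but it is not necessarily the speaker's exact construction; data and parameters are placeholders.

```python
import numpy as np
import xgboost as xgb

rng = np.random.RandomState(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
names = ["f0", "f1", "f2", "f3", "f4"]

dtrain = xgb.DMatrix(X, label=y, feature_names=names)
booster = xgb.train({"objective": "binary:logistic", "max_depth": 4},
                    dtrain, num_boost_round=50)

# Per-prediction attributions: one column per feature, plus a final bias column.
contribs = booster.predict(dtrain, pred_contribs=True)

# Explanation of a single prediction (here, the first data point).
print(dict(zip(names + ["bias"], contribs[0])))

# Aggregation over a set of points: mean absolute contribution per feature.
print(dict(zip(names, np.abs(contribs[:, :-1]).mean(axis=0))))
```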

III - Implementation and examples

1) Illustration of point-to-point feature importance and explanation of the implementation

2) Evolution of feature importance with respect to learning iterations

3) Cancellation of noisy variables (see the sketch below)
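To give a flavour of items 2 and 3, here is a small sketch on synthetic data (illustrative parameters, not the speaker's code): pure-noise columns are added next to two informative features, and the gain-based importance is read off at increasing numbers of boosting rounds, where one would expect the noise columns to remain relatively unimportant.

```python
import numpy as np
import xgboost as xgb

rng = np.random.RandomState(0)
X_signal = rng.normal(size=(1000, 2))
X_noise = rng.normal(size=(1000, 3))      # pure noise, unrelated to the label
X = np.hstack([X_signal, X_noise])
y = (X_signal[:, 0] + 0.5 * X_signal[:, 1] > 0).astype(int)
names = ["signal_0", "signal_1", "noise_0", "noise_1", "noise_2"]

dtrain = xgb.DMatrix(X, label=y, feature_names=names)
params = {"objective": "binary:logistic", "max_depth": 4}

for rounds in (10, 50, 200):
    # Retrain with more rounds; with the default deterministic settings the
    # first trees coincide, so this amounts to inspecting the model at
    # successive stages of learning.
    booster = xgb.train(params, dtrain, num_boost_round=rounds)
    print(rounds, booster.get_score(importance_type="gain"))
```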

IV - Limits and ways forward

1) A word on correlated variables

2) Is there a trade-off between performance and interpretability?
