Sunday 14:10–14:45 in Auditorium

“Why Should I Trust You?” - Debugging black-box text classifiers

Tobias Sterbak


Classifying text is a common use case for machine learning algorithms. But despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction. We will use eli5 and the LIME algorithm to explain text classifiers.


Understanding the predictions of text classifiers is a crucial step in debugging natural language processing pipelines and in gaining trust in their predictions. Both depend directly on how well a human understands the model's behavior. But even in linear models, hundreds or thousands of features can contribute significantly to a prediction, so it is not reasonable to expect a user to comprehend why a prediction was made, even if individual weights can be inspected. Models are typically evaluated using accuracy metrics on an available validation dataset. However, real-world data is often significantly different, and the evaluation metric may not reflect the product's goal. Inspecting individual predictions and their explanations is a worthwhile complement to such metrics. In this case, it is important to aid users by suggesting which instances to inspect, especially for large datasets.

I give an introduction to Local Interpretable Model-agnostic Explanations (LIME) by Ribeiro et al., 2016, an explanation technique that explains the predictions of any classifier in an interpretable and faithful manner by learning an interpretable model locally around the prediction. Then I show how to apply LIME with the eli5 Python library to understand the predictions of text classification pipelines.
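
To make the idea of "learning an interpretable model locally around the prediction" concrete, here is a minimal, hand-rolled sketch in plain Python. It is not eli5's implementation or the reference code from the paper. The black box is a toy keyword scorer (`black_box_proba` and `LEXICON` are hypothetical stand-ins for a trained pipeline), and the kernel width, sample count, and ridge term are illustrative assumptions. The sketch drops words at random, queries the black box on each variant, weights the variants by how close they stay to the original text, and fits a weighted linear model whose coefficients act as per-word explanations.

```python
import math
import random

# Hypothetical stand-in for a black-box classifier: returns P(positive | text).
# In practice this would be the predict_proba of a trained pipeline.
LEXICON = {"great": 2.0, "love": 1.5, "terrible": -2.5, "boring": -1.5}


def black_box_proba(text):
    score = sum(LEXICON.get(tok, 0.0) for tok in text.split())
    return 1.0 / (1.0 + math.exp(-score))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def explain(text, predict, n_samples=500, kernel_width=0.75, ridge=1e-3, seed=0):
    """Return (token, weight) pairs from a local weighted linear surrogate."""
    rng = random.Random(seed)
    tokens = text.split()
    d = len(tokens)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        # Binary mask: 1 keeps the word, 0 drops it.
        mask = [rng.randint(0, 1) for _ in tokens]
        perturbed = " ".join(t for t, keep in zip(tokens, mask) if keep)
        # Distance to the original text = fraction of words removed.
        distance = 1.0 - sum(mask) / d
        rows.append(mask + [1])  # trailing 1 is the intercept column
        targets.append(predict(perturbed))
        weights.append(math.exp(-(distance ** 2) / kernel_width ** 2))
    # Weighted ridge normal equations: (X^T W X + ridge * I) beta = X^T W y.
    # The small ridge term (applied to every column, intercept included)
    # only keeps the system well conditioned.
    k = d + 1
    xtwx = [
        [
            sum(w * r[i] * r[j] for r, w in zip(rows, weights))
            + (ridge if i == j else 0.0)
            for j in range(k)
        ]
        for i in range(k)
    ]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets)) for i in range(k)]
    beta = solve(xtwx, xtwy)
    # Drop the intercept and sort words by absolute influence.
    return sorted(zip(tokens, beta[:d]), key=lambda p: abs(p[1]), reverse=True)


for token, weight in explain("great acting but terrible boring plot", black_box_proba):
    print(f"{token:>10s} {weight:+.3f}")
```

Words the toy scorer reacts to should receive clearly non-zero weights with the matching sign, while neutral words should stay close to zero. With a real classifier, only the `predict` argument would change.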
