Deep learning models are now deployed across many domains. The question data scientists and users keep asking is "Why does it work?". Explaining the decisions of neural networks is vital for model improvement, analysis, and user adoption. In this talk, I will walk through implementations of interpretability methods in TensorFlow 2.0 and introduce tf-explain, a TF2.0 library for interpretability.
We will explore research papers on neural network interpretability at different scales: from the ultra-specific, with analysis of convolutional filters, to more user-friendly input visualizations.
For each method, I'll provide a theoretical explanation (which mathematical operations we are performing) and a TensorFlow 2 implementation to examine in detail how to proceed.
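To give a flavour of these implementations, here is a minimal sketch of one of the simplest methods, a vanilla-gradients saliency map, written with `tf.GradientTape` in TensorFlow 2. The model and class index below are placeholders for illustration, not the exact code presented in the talk.

```python
import tensorflow as tf

def vanilla_gradients(model, images, class_index):
    """Saliency map: gradient of the target class score w.r.t. the input pixels."""
    inputs = tf.cast(images, tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(inputs)                # track the input tensor, not just trainable variables
        predictions = model(inputs)       # forward pass, shape (batch, num_classes)
        scores = predictions[:, class_index]
    gradients = tape.gradient(scores, inputs)              # same shape as the input batch
    saliency = tf.reduce_max(tf.abs(gradients), axis=-1)   # collapse channels into a 2D heatmap
    return saliency

# Illustrative usage with a Keras application model and a random stand-in image
model = tf.keras.applications.MobileNetV2(weights="imagenet")
images = tf.random.uniform((1, 224, 224, 3))
heatmap = vanilla_gradients(model, images, class_index=281)
```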
Finally, we will go through tf-explain usage, from offline model inspection to training monitoring.
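As a preview, the sketch below shows the two usage modes with the GradCAM explainer and its Keras callback, following the pattern described in the tf-explain README; exact argument names and defaults may differ slightly between library versions, and the data here are random stand-ins.

```python
import numpy as np
import tensorflow as tf
from tf_explain.core.grad_cam import GradCAM
from tf_explain.callbacks.grad_cam import GradCAMCallback

model = tf.keras.applications.VGG16(weights="imagenet")

# Offline model inspection: explain a single prediction after training.
image = np.random.random((224, 224, 3))  # stand-in for a real preprocessed image
explainer = GradCAM()
grid = explainer.explain(
    ([image], None), model, class_index=281, layer_name="block5_conv3"
)
explainer.save(grid, ".", "grad_cam.png")

# Training monitoring: log Grad-CAM heatmaps during fit() via a Keras callback.
x_val = np.random.random((4, 224, 224, 3))                 # hypothetical validation images
y_val = tf.keras.utils.to_categorical(np.arange(4), 1000)  # hypothetical validation labels
callbacks = [
    GradCAMCallback(
        validation_data=(x_val, y_val),
        class_index=281,
        layer_name="block5_conv3",
        output_dir="./logs/grad_cam",
    )
]
# model.fit(x_train, y_train, epochs=5, callbacks=callbacks)
```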