PyData London 2018 - Presentation: Unsupervised Anomaly Detection with Isolation Forest

Unsupervised Anomaly Detection with Isolation Forest

Audience level:

Intermediate

Description

This talk will focus on the importance of correctly defining an anomaly when conducting anomaly detection with unsupervised machine learning. It will include a review of the Isolation Forest algorithm (Liu et al. 2008) and a demonstration of how this algorithm can be applied to transaction monitoring, specifically to detect money laundering.

Abstract

Anomaly detection using machine learning has applications in many fields, including fraud detection. Automated transaction monitoring is another area where it is used, specifically to fight financial crimes such as money laundering. This talk will briefly review the most common unsupervised anomaly detection methods and will focus on the Isolation Forest algorithm (Liu et al. 2008).

Perhaps the most important step towards successfully detecting money laundering is to recognise that a transaction can often be described as anomalous only under a certain set of factors. Such factors are non-obvious and are revealed only with considerable subject matter expertise. Knowing these factors allows one to choose the right attributes while still leveraging unsupervised learning.

This talk will cover:

Why detecting money laundering is different from other anomaly detection problems (and how it further varies by banking type).
A brief review of unsupervised learning models for anomaly detection.
A description of the Isolation Forest algorithm.
A short overview of its implementation in scikit-learn.
A walk-through of a Python workbook with the Isolation Forest algorithm applied to an anomaly detection task.
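
As a rough illustration of the scikit-learn implementation mentioned above, the following minimal sketch fits an Isolation Forest to synthetic data. It is not the workbook from the talk: the two features (transaction amount and hour of day), the generated values, and the `contamination` setting are assumptions chosen purely for demonstration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Hypothetical transaction features: [amount, hour of day].
# These synthetic values stand in for real, expert-selected attributes.
normal = np.column_stack([
    rng.normal(100.0, 20.0, size=500),  # typical amounts
    rng.normal(13.0, 2.0, size=500),    # typical daytime hours
])
unusual = np.array([
    [5000.0, 3.0],
    [7500.0, 2.0],
    [6200.0, 4.0],
])
X = np.vstack([normal, unusual])

# contamination is the assumed share of anomalies; it only sets the
# threshold used by predict, not the anomaly scores themselves.
model = IsolationForest(n_estimators=100, contamination=0.01, random_state=42)
labels = model.fit_predict(X)    # 1 = inlier, -1 = anomaly
scores = model.score_samples(X)  # lower score = more anomalous

flagged = np.flatnonzero(labels == -1)
print("Flagged row indices:", flagged)
```

Because the model is unsupervised, no labels are needed for fitting. Which attributes go into `X` remains the decision that depends on subject matter expertise.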

Sunday 11:00–11:45 in Tower Suite 3

Elena Sharova
