This tutorial presents introductory material for those who have never implemented anomaly detection or novelty detection. We will start by describing what an anomaly is and how anomaly detection differs from other machine learning approaches. The main part covers density-based anomaly detection for the one-dimensional and multidimensional cases.
The plan is to combine theory with practice, so the tutorial is divided into two parts:

1. Theory: anomaly detection vs. other machine learning approaches
   a) What exactly is an anomaly
   b) Density-based methods for:
      I. The one-dimensional case
      II. The multivariate case with independent features
      III. The multivariate case with dependent features
   c) Estimator-based methods

2. Hands-on:
   a) Density-based method using a Gaussian (or non-Gaussian) distribution: implementation for multidimensional datasets with dependent and independent features
   b) Estimator-based method: implementation for a multidimensional dataset
   c) Comparison with other anomaly detectors implemented in scikit-learn
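As a rough preview of hands-on item 2a, the following is a minimal sketch of a Gaussian density-based detector. It is not the tutorial's actual code. The function names (`fit_gaussian`, `log_density`, `flag_anomalies`), the 1% quantile threshold, and the synthetic data are all illustrative assumptions. Setting `independent=True` keeps only the diagonal of the covariance matrix (features treated as independent); `independent=False` uses the full covariance (features may be dependent).

```python
import numpy as np

def fit_gaussian(X, independent=False):
    """Estimate the mean and covariance of the training data.

    With independent=True only the per-feature variances are kept,
    which treats the features as independent (assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    if independent:
        cov = np.diag(np.diag(cov))
    return mean, cov

def log_density(X, mean, cov):
    """Log of the multivariate Gaussian density at each row of X."""
    X = np.asarray(X, dtype=float)
    d = mean.shape[0]
    diff = X - mean
    # Squared Mahalanobis distance, computed without an explicit inverse.
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def flag_anomalies(X_train, X_new, independent=False, quantile=0.01):
    """Flag rows of X_new whose density falls below a training quantile.

    The 1% default is an arbitrary illustrative choice, not a recommendation.
    """
    mean, cov = fit_gaussian(X_train, independent)
    train_scores = log_density(X_train, mean, cov)
    threshold = np.percentile(train_scores, 100.0 * quantile)
    return log_density(X_new, mean, cov) < threshold

# Synthetic, strongly correlated 2-D data (made up for illustration only).
rng = np.random.RandomState(0)
X_train = rng.multivariate_normal([0, 0], [[1.0, 0.9], [0.9, 1.0]], size=2000)
points = np.array([[0.0, 0.0], [2.0, -2.0]])

print(flag_anomalies(X_train, points, independent=False))
print(flag_anomalies(X_train, points, independent=True))
```

On data like this, the point (2, -2) lies against the direction of correlation. The full-covariance model should therefore flag it, while the independent-features model is likely to miss it because each coordinate looks ordinary on its own. This is the practical difference between cases II and III of the theory part.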
During the hands-on part we will work in Python 2.7. Other requirements: scikit-learn 0.18 or higher, NumPy 1.0 or higher, and pandas. The easiest way to get all of these is to install the newest Anaconda distribution.