Saturday 14:15–15:00 in Kursraum 3

Manifold Learning and Dimensionality Reduction for Data Visualization and Feature Engineering

Stefan Kühn

Audience level:
Intermediate

Description

Dimensionality Reduction methods like PCA (Principal Component Analysis) are widely used in Machine Learning for a variety of tasks. But beyond the well-known standard methods, many more tools are available, especially in the context of Manifold Learning. We will interactively explore these tools and present applications for Data Visualization and Feature Engineering using scikit-learn.

Abstract

Slides

https://de.slideshare.net/StefanKhn4/talk-at-pydata-berlin-about-manifold-learning-and-applications

Jupyter Notebooks

https://github.com/cc-skuehn/Manifold_Learning

At the end of the slide deck (and in the README of the GitHub repo) there are links to further resources, e.g. the relevant parts of the scikit-learn documentation.

Outline

Dimensionality reduction techniques are not only useful for denoising or for making data more accessible, they are also essential for any meaningful Exploratory Data Analysis (EDA), especially with respect to Data Visualization. Manifold Learning subsumes a collection of advanced methods from the field of Unsupervised Learning that capture different aspects of high-dimensional data in a low-dimensional manifold. Each method tries to preserve a particular quantity, such as distances between points, variance, or statistical and distributional properties. The variety of these different perspectives on the data is not only helpful for EDA and Data Visualization but also opens up new and interesting options for Feature Engineering and the ultimate task of Machine Learning and AI: "Learning from Data".
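As a minimal illustration of this idea (a sketch, not part of the original talk materials), the following scikit-learn snippet compares PCA with two manifold learning methods, Isomap and t-SNE, on the toy S-curve dataset; the dataset choice and parameter values are assumptions made purely for demonstration.

# Sketch: comparing PCA with two manifold learning methods from scikit-learn
# on the S-curve toy dataset (a 2-D manifold embedded in 3-D space).
# Dataset and parameters are illustrative assumptions, not from the talk.
import matplotlib.pyplot as plt
from sklearn.datasets import make_s_curve
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap, TSNE

# Generate the 3-D S-curve; `color` encodes position along the manifold
X, color = make_s_curve(n_samples=1000, random_state=0)

# Each method preserves a different quantity in the 2-D embedding
embeddings = {
    "PCA (preserves variance)": PCA(n_components=2).fit_transform(X),
    "Isomap (preserves geodesic distances)": Isomap(n_neighbors=10, n_components=2).fit_transform(X),
    "t-SNE (preserves local neighborhoods)": TSNE(n_components=2, random_state=0).fit_transform(X),
}

# Plot the three embeddings side by side for visual comparison
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
for ax, (name, Y) in zip(axes, embeddings.items()):
    ax.scatter(Y[:, 0], Y[:, 1], c=color, s=5)
    ax.set_title(name)
plt.tight_layout()
plt.show()

For Feature Engineering, the same fit_transform output can be appended to the original feature matrix as additional low-dimensional features before training a supervised model.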
