Wednesday 1:15 PM–2:45 PM in Winter Garden (5412)

Machine learning from scratch using the scientific Python stack

Lara Kattan

Audience level:
Intermediate

Description

You know that, under the hood, your favorite Python modeling libraries rely on numerical methods for optimization. Make your models more efficient by deepening your understanding of scientific Python (SciPy and NumPy). We'll review sparse matrices, matrix decompositions, gradient descent, and the Fourier Transform, then write an algorithm from scratch using only NumPy and SciPy!
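
For a taste of the tools involved, here is a minimal sketch (illustrative only, not the tutorial's actual material) of storing a small document-term matrix sparsely with SciPy and taking a truncated SVD, the decomposition behind LSA:

    import numpy as np
    from scipy import sparse
    from scipy.sparse.linalg import svds

    # A tiny document-term count matrix, stored sparsely (most entries are zero)
    counts = np.array([
        [2, 0, 1, 0, 0],
        [0, 1, 0, 3, 0],
        [1, 0, 0, 0, 2],
        [0, 2, 0, 1, 0],
    ], dtype=float)
    X = sparse.csr_matrix(counts)

    # Truncated SVD keeps only the k largest singular triplets -- the core of LSA
    U, s, Vt = svds(X, k=2)
    print(s)                   # the two largest singular values (ascending order)
    print(U.shape, Vt.shape)   # (4, 2) and (2, 5)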

Abstract

Level up your data science by diving deep into the innards of scientific Python

Building your first few scikit-learn models is gratifying, but where do you go from there? Gaining a deeper understanding of the numerical methods underlying your favorite modeling library is important for advancing in your data science career, as it allows you to make more informed decisions about efficiency and runtime. Dive deep into the innards of the scientific Python stack (SciPy and NumPy) in a way that is relevant to data science, statistics, and related numerical fields.

Tutorial structure: review math, write two algorithms from scratch

This tutorial will have two parts: we'll spend 45 minutes reviewing numerical solution methods by hand, then dedicate 45 minutes to rewriting popular machine learning algorithms from scratch using only NumPy and SciPy. In particular, we'll explore matrix decompositions for feature extraction and NLP, including topic modeling, plus gradient descent and the Fast Fourier Transform (FFT). We'll end by using NumPy and SciPy to code up PCA/LSA and gradient descent by hand! This should give you the confidence to dive deeper into the code base of Python machine learning libraries like scikit-learn, and the knowledge to start contributing to open-source machine learning software in Python.
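
To give a sense of what "from scratch" means here, a minimal sketch of batch gradient descent for least-squares regression using only NumPy (the learning rate, iteration count, and toy data are illustrative; the tutorial's own implementation may differ):

    import numpy as np

    def gradient_descent(X, y, lr=0.1, n_iters=500):
        """Minimize mean squared error ||Xw - y||^2 / n by batch gradient descent."""
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(n_iters):
            grad = (2.0 / n) * X.T @ (X @ w - y)   # gradient of the MSE loss
            w -= lr * grad
        return w

    # Toy data: y = 3*x0 - 2*x1 plus a little noise
    rng = np.random.default_rng(42)
    X = rng.normal(size=(200, 2))
    y = X @ np.array([3.0, -2.0]) + 0.01 * rng.normal(size=200)
    print(gradient_descent(X, y))   # should be close to [3, -2]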

After this tutorial, you will:

- Understand the numerical methods (matrix decompositions, gradient descent, the FFT) that underlie common machine learning libraries
- Be able to implement PCA/LSA and gradient descent from scratch using only NumPy and SciPy
- Feel confident reading, and eventually contributing to, the source code of libraries like scikit-learn
