You know that, under the hood, your favorite Python modeling libraries use numerical methods for optimization. Now increase the efficiency of your models by deepening your understanding of scientific Python (SciPy and NumPy). We'll review sparse matrices, matrix decomposition, gradient descent, and the Fourier Transform, then write an algorithm from scratch using only NumPy and SciPy!
Building your first few scikit-learn models is gratifying, but where do you go from there? Gaining a deeper understanding of the numerical methods underlying your favorite modeling library is important for advancing in your data science career, because it allows you to make more informed decisions about efficiency and run-time. Dive deep into the innards of the scientific Python stack (SciPy and NumPy) in a way that is relevant for data science, statistics, and related numerical fields.
This tutorial will have two parts: we'll spend 45 minutes reviewing numerical solution methods by hand, then dedicate 45 minutes to re-writing a popular machine learning algorithm from scratch using only NumPy and SciPy. In particular, we'll explore matrix decompositions for feature extraction and NLP, including topic modeling, plus gradient descent and the Fast Fourier Transform (FFT). We'll end by using NumPy and SciPy to code up PCA/LSA and gradient descent by hand! This should give you the confidence to dive deeper into the code base of Python machine learning libraries like scikit-learn, and the knowledge to start contributing to open source Python machine learning software.
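To give a flavor of the from-scratch exercises, here is a minimal sketch of both: PCA/LSA via the singular value decomposition, and gradient descent for least-squares regression. The toy data, learning rate, and iteration count are illustrative assumptions, not the tutorial's actual material.

```python
import numpy as np

# Hypothetical toy data: 100 samples, 5 features (purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# --- PCA/LSA via SVD: the decomposition behind both techniques ---
X_centered = X - X.mean(axis=0)           # PCA requires centered data
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
X_pca = X_centered @ Vt[:2].T             # project onto top 2 components

# --- Gradient descent for least-squares regression ---
true_w = np.array([1.0, -2.0, 0.5, 3.0, 0.0])   # assumed "ground truth"
y = X @ true_w + 0.01 * rng.normal(size=100)

w = np.zeros(5)
lr = 0.1                                  # assumed step size
for _ in range(500):
    grad = X.T @ (X @ w - y) / len(y)     # gradient of mean squared error
    w -= lr * grad                        # step opposite the gradient

print(X_pca.shape)   # reduced representation: (100, 2)
```

Writing these few lines yourself, rather than calling `sklearn.decomposition.PCA` or a built-in solver, is exactly the kind of exercise that demystifies what the library is doing for you.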