Sunday 11:00–11:45 in Tower Suite 2

A Primer (or Refresher) On Linear Algebra for Data Science

Ruben van de Geer

Audience level:
Intermediate

Description

Virtually all machine learning models and technologies used in data science rely "under the hood" on linear algebra. In this talk, I will discuss the most important results from linear algebra that are relevant for data science, which I will illustrate with coded examples and geometrical interpretations.

Abstract

Virtually all machine learning models and technologies used in data science rely "under the hood" on linear algebra. This holds true for the machine learning workhorses, such as linear regression, principal component analysis, and collaborative filtering, as well as for state-of-the-art deep learning models. Admittedly, a thorough understanding of linear algebra is by no means required to work as a data scientist; after all, who needs linear algebra if you have .fit() and .predict()? However, it certainly helps to know the fundamentals of linear algebra, for example, to interpret and analyze algorithms or to speed up your Python code.
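As a small illustration of the "speed up your Python code" point, the sketch below computes the same linear predictions twice: once with nested Python loops and once as a single matrix-vector product. The data, the weight vector, and the function names predict_loop and predict_vectorized are made up for this example and are not taken from the talk.

```python
import numpy as np

# Made-up data: 1000 samples with 5 features, and a hypothetical weight vector.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
w = rng.normal(size=5)


def predict_loop(X, w):
    """Compute one dot product per row, element by element, in pure Python."""
    out = []
    for row in X:
        total = 0.0
        for x_j, w_j in zip(row, w):
            total += x_j * w_j
        out.append(total)
    return np.array(out)


def predict_vectorized(X, w):
    """Compute the same predictions as a single matrix-vector product."""
    return X @ w


y_loop = predict_loop(X, w)
y_vec = predict_vectorized(X, w)

# Both versions agree up to floating-point rounding.
print(np.allclose(y_loop, y_vec))
```

Recognizing the loop as a matrix-vector product is what allows NumPy to hand the work to optimized compiled routines, which is typically much faster than iterating in Python.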

To this end, I will discuss the most important results from linear algebra for data science. More precisely, I will start with the very basics (vectors, matrices, matrix arithmetic, matrix/vector norms) and provide geometrical interpretations and code examples for each of those topics. Then, we will consider the geometrical interpretation of ordinary least squares regression. Finally, I will discuss what matrix factorization is, what types of matrix factorizations exist, and why they are useful for you.
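The geometrical view of ordinary least squares and the idea of a matrix factorization can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not material from the talk itself: the data-generating line y = 2 + 3x, the noise level, and all variable names are assumptions chosen for the example, and the singular value decomposition stands in for just one of several possible factorizations.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 2 + 3x plus noise.
rng = np.random.default_rng(42)
n = 50
x = rng.uniform(0, 10, size=n)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=n)

# Design matrix with an intercept column and the feature column.
A = np.column_stack([np.ones(n), x])

# Ordinary least squares: find beta minimizing the Euclidean norm of A @ beta - y.
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

# Geometrical interpretation: y_hat is the orthogonal projection of y onto
# the column space of A, so the residual is orthogonal to every column of A.
y_hat = A @ beta
residual = y - y_hat
orthogonality = A.T @ residual  # numerically close to the zero vector

# Vector norm: the length of the residual is the quantity OLS minimizes.
residual_norm = np.linalg.norm(residual)

# Matrix factorization: the (thin) singular value decomposition A = U diag(s) Vt.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_rebuilt = U @ np.diag(s) @ Vt

# The same least squares solution, obtained from the factorization.
beta_svd = Vt.T @ ((U.T @ y) / s)

print(np.allclose(orthogonality, 0.0, atol=1e-8))
print(np.allclose(A_rebuilt, A))
print(np.allclose(beta, beta_svd))
```

The orthogonality check is the algebraic statement of the picture in which the fitted values are the "shadow" of y on the plane spanned by the columns of A. The last lines show one reason factorizations are useful: once A is factorized, solving the least squares problem reduces to a few cheap matrix-vector operations.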
