Wednesday 3:00 PM–4:30 PM in Winter Garden (5412)

Hacking the Data Science Challenge

Michoel Snow, Hillary Green-Lerman

Audience level:


You’ve passed the first round of interviews and are now given a data science take-home challenge. How can you analyze the data, demonstrate your data science abilities, and tick the required checkboxes, all within the allotted time? This tutorial walks you through a data science challenge, using pre-built functions to automate the boring stuff.


Coding a good data science model for work is not the same as programming a good data science challenge for an interview. This tutorial will cover what you need to know to make your challenge stand out from the crowd and will highlight common mistakes and pitfalls to avoid. We will stress best practices more than specific machine learning techniques. This tutorial assumes you have a working knowledge of Pandas and at least one plotting library.

During the tutorial you will be split into groups to work through different parts of the data science take-home challenge. The code for the tutorial can be found at

The tutorial will focus on the following topics:

Working with trick data
Outlining your challenge
Exploratory Data Analysis (EDA)
Modeling data

If time permits, we will also discuss the following general best practices:
