Wednesday 1:30 PM–3:00 PM in Music Box 5411/Winter Garden 5412 (5th fl)

Pandas: .head() to .tail()

Tom Augspurger

Audience level:
Novice

Description

An introduction to using pandas and related libraries for data analysis. We'll start by covering the core components of pandas including data structures, indexing, reshaping, and grouped operations. We'll then apply these methods along with other PyData libraries like seaborn and statsmodels to work through a pair of data analysis tasks.
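As a rough illustration of the core components named above (data structures, indexing, reshaping, and grouped operations), here is a minimal sketch. It is not taken from the tutorial materials; the toy data and variable names are invented for the example.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame(
    {
        "city": ["Austin", "Austin", "Boston", "Boston"],
        "year": [2015, 2016, 2015, 2016],
        "sales": [10, 12, 7, 9],
    }
)

# Data structures: a DataFrame is a table of labeled columns (each a Series).
# .head() and .tail() return the first and last rows.
first_rows = df.head(2)
last_rows = df.tail(2)

# Indexing: a boolean mask with .loc selects rows, a list of labels selects columns.
austin = df.loc[df["city"] == "Austin", ["year", "sales"]]

# Reshaping: pivot from long form to wide form, one column per year.
wide = df.pivot(index="city", columns="year", values="sales")

# Grouped operations: split the rows by city, then sum sales within each group.
totals = df.groupby("city")["sales"].sum()

print(wide)
print(totals)
```

The same small table is reused for each step so that the effect of each operation is easy to compare against the original rows.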

Abstract

The tutorial will focus on solving common problems in data analysis by writing clean, readable, efficient code. Pandas will be the primary tool, though integrations with other libraries like scikit-learn, statsmodels, and matplotlib will be demonstrated. The emphasis will be on gradually learning methods for massaging data into the correct form through real applications, rather than an exhaustive walk-through of pandas' API.

This tutorial is aimed at beginner and intermediate PyData users. Attendees should ideally have some experience with NumPy, though the basics of NumPy and its relationship to pandas will be covered briefly. The core of the tutorial covers pandas' data structures, indexing, reshaping, and grouped operations.

After covering those core operations, we'll move on to exploratory data analysis using pandas and seaborn, and time series analysis with statsmodels.
