This tutorial is a hands-on introduction to pandas. pandas is a fast, powerful, flexible, and easy-to-use data analysis and manipulation tool built on top of the Python programming language.
This tutorial is aimed at people new to pandas. Knowledge of Python will be useful but is not essential. We will work in JupyterLab, and attendees are expected to complete short exercises during the tutorial. The tutorial starts with the basic pandas features and then goes deeper into some of the pandas implementation details, so that attendees are prepared to deal with real data. We will cover the following topics:
Loading data from different formats
Transforming data and data types
Joining datasets and reshaping
Filtering and selecting
Dealing with missing values
Aggregating data and computing statistics
Data visualization
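To give a flavor of these topics, here is a minimal sketch that touches most of them on a tiny dataset. The data, column names, and variable names are invented for illustration and are not taken from the tutorial material; the tutorial itself may use different data and approaches.

```python
import io

import pandas as pd

# Hypothetical sample data, invented for illustration only.
sales_csv = io.StringIO(
    "store,month,units\n"
    "A,2024-01,10\n"
    "A,2024-02,\n"
    "B,2024-01,7\n"
    "B,2024-02,5\n"
)
stores = pd.DataFrame({"store": ["A", "B"], "city": ["Lyon", "Porto"]})

# Loading data: read_csv accepts file paths as well as file-like objects.
sales = pd.read_csv(sales_csv)

# Dealing with missing values: treat the missing unit count as 0.
sales["units"] = sales["units"].fillna(0)

# Transforming data types: the gap made "units" float, so cast it back,
# and parse the month strings into datetimes.
sales["units"] = sales["units"].astype("int64")
sales["month"] = pd.to_datetime(sales["month"], format="%Y-%m")

# Joining datasets on the shared "store" column.
merged = sales.merge(stores, on="store", how="left")

# Filtering and selecting rows with a boolean mask.
busy = merged[merged["units"] > 5]

# Aggregating: total units per city.
totals = merged.groupby("city")["units"].sum()

# Reshaping: one row per city, one column per month.
wide = merged.pivot(index="city", columns="month", values="units")

print(totals)
print(wide)

# Data visualization: totals.plot.bar() would draw a bar chart,
# but it requires matplotlib to be installed, so it is left out here.
```

In a JupyterLab notebook, each step could go in its own cell so that the intermediate DataFrames can be inspected along the way.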