Thursday 2:00 PM–2:40 PM in Room 1

Transforming Data to Unlock Its Latent Value

Tony Ojeda

Audience level:
Intermediate

Description

This talk is about understanding the real-world entities represented by our data, creatively conceptualizing different perspectives from which that data can be analyzed, and then bringing those conceptualizations to life with Python libraries such as Pandas, Scikit-Learn, Seaborn, and Yellowbrick so that we can unlock the latent value and insights hidden in our data.

Abstract

At the heart of data analysis lies a need to understand the real-world entities represented in the data. Every data set we encounter is an attempt to capture a slice of our complex world and communicate some information about it in a way that has the potential to be informative to humans, machines, or both. Moving from basic analyses to advanced analytics requires the ability to imagine multiple ways of conceptualizing the composition of entities and the relationships present in our data. It also requires the realization that different levels of aggregation, disaggregation, and transformation can open up new pathways to understanding our data and identifying the valuable insights it contains.

In this talk, we’ll discuss several ways to think about the composition and representation of our data. We’ll also demonstrate a series of methods that leverage tools like networks, hierarchical aggregations, and unsupervised clustering to visually explore our data, transform it to discover new insights, help frame analytical problems and questions, and even improve machine learning model performance. In exploring these approaches, and with the help of Python libraries such as Pandas, Scikit-Learn, Seaborn, and Yellowbrick, we will provide a practical framework for thinking creatively and visually about your data and unlocking latent value and insights hidden deep beneath its surface.
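
As a rough illustration of what moving between levels of aggregation and then clustering the result might look like, here is a minimal sketch using Pandas and Scikit-Learn. The transaction data, the column names, and the choice of two clusters are all invented for this example and are not taken from the talk itself.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical transaction-level data (invented for illustration only).
transactions = pd.DataFrame({
    "region": ["East"] * 4 + ["West"] * 6,
    "store":  ["A", "A", "B", "B", "C", "C", "C", "D", "D", "D"],
    "amount": [10, 12, 11, 9, 200, 220, 210, 190, 205, 215],
})

# Aggregate up one level: one row per store instead of one per transaction.
by_store = transactions.groupby(["region", "store"], as_index=False).agg(
    total=("amount", "sum"),
    orders=("amount", "size"),
)

# Aggregate up another level: one row per region.
by_region = by_store.groupby("region", as_index=False).agg(
    total=("total", "sum"),
    stores=("store", "nunique"),
)

# Unsupervised clustering on the store-level view. Features are standardized
# first so that "total" does not dominate "orders" purely because of scale.
# Two clusters is an assumption made for this toy data.
features = StandardScaler().fit_transform(by_store[["total", "orders"]])
by_store["cluster"] = KMeans(
    n_clusters=2, n_init=10, random_state=0
).fit_predict(features)

print(by_store)
print(by_region)
```

Each view answers a different question: the transaction rows describe individual purchases, the store rows describe outlets, and the cluster label is a new derived feature that could be plotted or fed into a downstream model.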