Thursday 11:45 AM–12:20 PM in Main Room

Your data fits in RAM: how to avoid cluster computing

Aaron Richter

Audience level:
Intermediate

Description

When people hear "big data", they automatically assume a cluster of machines is required to analyze the data or build machine learning models. That may have been the case 5-10 years ago, but how times have changed! This talk shows how you can usually launch a single machine with enough RAM to handle your "big data" workloads.

Abstract

Cluster computing has its place, but it is often overkill for data analysis and machine learning workloads. This talk covers a number of tips and tricks for handling large datasets both locally and in the cloud. Case studies and code examples will show how easy it really is to analyze "big data" on simple infrastructure!

The audience is expected to have some familiarity with data analysis and machine learning in Python or R.
