When people hear "big data", they automatically assume a cluster of machines is required to analyze the data or build machine learning models. That may have been the case 5-10 years ago, but how times have changed! This talk shows how you can usually launch a single machine with enough RAM to handle your "big data" workloads.
Cluster computing has its place, but it is often overkill for data analysis and machine learning workloads. This talk will cover a number of tips and tricks for handling large datasets both locally and in the cloud. Case studies and code examples will show how easy it really is to analyze "big data" on simple infrastructure!
The audience is expected to have some familiarity with data analysis and machine learning in Python or R.