Your data is too big to fit in memory: loading it crashes your program. But it's also too small to justify a complex Big Data cluster. How can you process your data simply and quickly?
In this talk you'll learn the basic techniques for dealing with Small Big Data: money, compression, batching and parallelization, and indexing. In particular, you'll learn how to apply these techniques to NumPy and Pandas.
Your data is big enough that loading it into memory crashes your program, but small enough that setting up a Big Data cluster isn't worth the trouble. You're dealing with Small Big Data, and in this talk you'll learn the basic techniques used to process data that doesn't fit in memory.
First, you can just buy (or rent) more RAM. Sometimes that isn't sufficient or possible, in which case you can also:

- Compress your data so it fits in memory.
- Batch your data, loading and processing it one chunk at a time, and parallelize across batches (see the sketch after this list).
- Index your data so you only load the subset you actually need.
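To make batching concrete, here is a minimal sketch in plain Python, assuming a hypothetical file with one number per line; the whole file never has to fit in memory:

    # A minimal batching sketch: sum one number per line without
    # ever loading the whole file. File name and batch size are
    # hypothetical, for illustration only.
    def sum_of_values(path, batch_size=1_000_000):
        total = 0.0
        batch = []
        with open(path) as f:
            for line in f:
                batch.append(float(line))
                if len(batch) == batch_size:
                    total += sum(batch)  # fold in one full batch
                    batch = []
            total += sum(batch)          # leftover partial batch
        return total

    result = sum_of_values("measurements.txt")  # hypothetical file

Because each batch is processed independently, the same loop parallelizes naturally, for example by handing batches to a multiprocessing pool.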
You'll also learn how to apply these techniques to NumPy.
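For instance, here is a minimal sketch of two of those techniques in NumPy: compression via smaller dtypes, and batching via a memory-mapped file so only the slice you're currently working on gets loaded. The file name and sizes are hypothetical:

    import numpy as np

    # Compression: a smaller dtype can cut memory use dramatically.
    # float64 -> float32 halves it, if the lost precision is acceptable.
    big = np.ones(10_000_000, dtype=np.float64)   # ~80 MB
    small = big.astype(np.float32)                # ~40 MB

    # Batching: np.memmap keeps the data on disk and reads only
    # the slices you actually touch.
    small.tofile("data.bin")                      # hypothetical file
    arr = np.memmap("data.bin", dtype=np.float32, mode="r")
    total = 0.0
    batch_size = 1_000_000
    for start in range(0, len(arr), batch_size):
        total += arr[start:start + batch_size].sum()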
And you'll see how the same techniques apply to Pandas as well.
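Here is a minimal Pandas sketch combining those ideas, assuming a hypothetical ratings.csv with product_id and rating columns: load only the columns you need, shrink their dtypes, and stream the file in chunks:

    import pandas as pd

    # usecols loads only the needed columns, dtype shrinks them,
    # and chunksize streams the file one batch of rows at a time.
    chunks = pd.read_csv(
        "ratings.csv",                     # hypothetical input
        usecols=["product_id", "rating"],  # skip unneeded columns
        dtype={"rating": "float32"},       # smaller dtype
        chunksize=100_000,                 # rows per batch
    )

    # Aggregate chunk by chunk, then combine the partial results.
    partials = [c.groupby("product_id")["rating"].agg(["sum", "count"])
                for c in chunks]
    combined = pd.concat(partials).groupby(level=0).sum()
    mean_rating = combined["sum"] / combined["count"]

Because each chunk contributes only partial sums and counts, the final per-product mean is exact even though the full table is never in memory at once.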