Legate Pandas: Scaling the Python ecosystem

Mike McCarty

Prior knowledge:
Previous knowledge expected
Python, Data Science, Pandas

Summary

Legate Pandas is a distributed and accelerated drop-in replacement of Pandas. Legate Pandas enables high-performance, scalable execution of dataframe programs on multi-GPU systems by combining the Legion runtime with GPU accelerated dataframe kernels in cuDF. Legate Pandas targets dataframe programs with data processing requirements that cannot be fulfilled by a single GPU.

Description

  • What is Legate Pandas?
  • Why is this approach different?
  • Supported Features and Limitations
  • Performance benchmarks compared to Dask+cuDF and MPI+cuDF
  • What's next?