Sunday 1:30 PM–2:15 PM in Production DS, Modeling - Auditorium

Accelerating Data Science with RAPIDS

Keith Kraus

Audience level:
Intermediate

Description

Data science demands interactive exploration of large volumes of data, combined with computationally intensive algorithms and analytics. Today, CPUs are reaching their computational limits, and a new approach is needed. We will discuss how the GPU Open Analytics Initiative (GoAI) is breaking the compute barrier and accelerating data science with GPU-accelerated libraries such as PyGDF.

Abstract

  1. Challenges in Data Science today
  2. Technology interoperability
  3. Compute limitations
  4. Apache Arrow
  5. GPUs for compute (CPUs are the bottleneck)
    1. Deep learning
    2. Machine learning
    3. Data analytics
  6. The GPU Open Analytics Initiative (GoAI)
  7. The GPU Data Frame (GDF)
  8. Python library for GDF (PyGDF)
    1. Performance
    2. API
    3. Tips and tricks
  9. Scaling out to multi-GPU and multi-node via Dask GDF
    1. Performance
    2. API
    3. Tips and tricks
  10. CUDA array interface
    1. Numba + CuPy example
    2. PyTorch work in progress
  11. Future work
  12. GPU data frame
    1. Planned features
    2. Planned optimizations
  13. Machine learning
  14. Graph analytics
  15. Questions and answers
