Thursday 10:00 AM–10:40 AM in Radio City (#6604)

End to End Data Science Without Leaving The GPU

Randy Zwitch

Audience level:
Intermediate

Description

Using JupyterLab, Ibis and the OmniSci (formerly MapD) kernel for Jupyter, OmniSci Senior Developer Advocate Randy Zwitch will show an end-to-end data science workflow using only the GPU. Attendees will learn how tools conforming to the Apache Arrow memory specification can pass zero-copy references to each other, avoiding costly data serialization and allowing users to work as if everything were a pandas DataFrame.

Abstract

As datasets get larger and algorithms more complex, GPUs are frequently employed to minimize time-to-completion. Users may be familiar with using GPUs for machine learning, but did you know that feature generation and even basic statistics can also be performed on the GPU?

Using JupyterLab, Ibis and the OmniSci kernel for Jupyter, OmniSci Senior Developer Advocate Randy Zwitch will show an end-to-end data science workflow using only the GPU. By the end of the talk, attendees will understand how tools conforming to the Apache Arrow memory specification can pass zero-copy references to each other, avoiding costly data serialization and allowing users to work as if everything were a pandas DataFrame.
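To make the zero-copy idea concrete, here is a minimal CPU-side sketch using only the Python standard library. It is an analogy, not the Arrow or OmniSci API: it assumes a plain `array` as a stand-in for a columnar buffer and uses `memoryview` to hand a consumer a reference to the same memory, contrasted with a `pickle` round trip that serializes and copies the data. All variable names are illustrative.

```python
import pickle
from array import array

# Producer: a column of float64 values held in one contiguous buffer
# (a stand-in for an Arrow-style columnar buffer; not actual Arrow).
column = array("d", [1.5, 2.5, 3.5, 4.5])

# Zero-copy hand-off: memoryview exposes the same memory, no bytes are copied.
shared = memoryview(column)

# Serialized hand-off: pickle round-trips through a byte string, producing a copy.
copied = pickle.loads(pickle.dumps(column))

# A write by the producer is visible through the zero-copy view...
column[0] = 99.0
view_sees_update = shared[0] == 99.0
# ...but not in the serialized copy, which is a separate allocation.
copy_sees_update = copied[0] == 99.0

print(view_sees_update, copy_sees_update)
```

The view observes the producer's write because both names refer to one buffer, while the pickled copy does not. Tools that agree on a shared memory layout can exchange references in the same spirit, skipping the serialize/deserialize step entirely.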

Introduction (OmniSci, why/how of GPUs for analytics) - 5 minutes
GPU Open Analytics Initiative (GOAI) - 5 minutes
Data Science Live Demonstration - 20 minutes
    pymapd/Ibis with OmniSci backend for data munging
    Visualizing data with Vega/Vega-Lite on the GPU
    Simple ML example using H2O
Q&A - 10 minutes or remaining time
