Wednesday 10:00 AM–10:40 AM in Radio City (#6604)

Scaling-Out Data Analytics in Python

Oleksandr Pavlyk

Audience level:
Novice

Description

This talk outlines and discusses how Intel is actively tackling the scalability and productivity aspects of Python in data science, from classic numerical problems to modern data analytics workflows. In particular, the talk will introduce a proposed solution called “High Performance Analytic Toolkit”, designed to allow distributed computing within pandas and scikit-learn workflows.

Abstract

Python, although widely used for prototyping, is not designed to scale to large problems. As a result, organizations typically have a dedicated team that takes the prototype created by research or data scientists and rewrites the application in a performance-oriented language for deployment at scale in production environments. This talk outlines and discusses how Intel is actively tackling the scalability and productivity aspects of Python in data science, from classic numerical problems to modern data analytics workflows. In particular, the talk will introduce a proposed solution called “High Performance Analytic Toolkit”, designed to allow distributed computing within pandas and scikit-learn workflows.
