Python, although widely used for prototyping, was not designed to scale to large problems. As a result, organizations typically have a dedicated team that takes the prototype created by researchers or data scientists and rewrites the application in a higher-performance language for deployment at scale in production environments. This talk discusses how Intel is tackling both the scalability and the productivity of Python in data science, from classic numerical problems to modern data analytics workflows. In particular, the talk introduces a proposed solution called “High Performance Analytic Toolkit”, designed to enable distributed computing within pandas and scikit-learn workflows.