Tuesday 1:10 p.m.–1:45 p.m.

Towards a universal platform for data science on public and private clouds

Karim Chine

Audience level:
Intermediate

Description

ElasticR federates data science infrastructures and tools within a highly interactive, responsive and programmable framework. It provides a new generation of virtual, real-time collaborative and provenance-aware workbenches, notebooks and scientific spreadsheets, as well as tools for building Python-, R-, Julia-, Scala- and SQL-based interactive and collaborative web applications.

Abstract

Today, R and Python are everywhere: their capabilities are deployed in a broad spectrum of ways and help solve a plethora of data science problems for academia and industry. Julia is growing in popularity thanks to its near-C performance. Scala is the language of choice for getting the best out of Spark. Huge amounts of data are stored in relational databases, and SQL remains an inescapable tool for data science. We have built and made accessible ElasticR, an online platform based on multi-personality, provenance-aware and real-time collaborative engines, to make it easy to work simultaneously with R, Python, Julia, Scala and SQL, and to build analyses, services and applications combining the best each language has to offer.

ElasticR enables everyone to seamlessly provision and control those engines on any public or private cloud, any cluster, or any cheap computing device, and to use them:

From an R+Python+Julia+Scala+SQL workbench or notebook running in the browser, accessible collaboratively like a Google document from anywhere and on any device, including tablets and mobile phones.

From an R or Python command line, to programmatically build, in a fully reproducible manner, data science infrastructures, services, workflows, applications, dashboards, etc.

To embed R+Python+Julia+Scala+SQL capabilities, programmatically or interactively, in Microsoft Office documents, spreadsheets or presentations.

To enable data science-centric social networking and to make the sharing of data science artifacts, static and interactive, as easy as sharing a file on Dropbox.

To create interactive web applications backed by any combination of the five environments, thanks to a new framework for reactive cross-language programming and a data bridge allowing variables to flow seamlessly between R, Python, Julia, Scala and SQL.

To distribute those web applications to the world or to specific users, thanks to ElasticR's cloud token mechanism, which combines a description of cloud resources with data, models and interactivity specifications.

To put R+Python to use in the IoT realm and to create links between R+Python engines and the physical world.

The presentation will be an overview of the main concepts behind the platform's design and will include live demos of ElasticR on AWS, GCE, OpenStack and Raspberry Pis.
