Conference Schedule

NOTE: There is a social event on Friday at Kosmos after the last Keynote.

Main Sessions — Friday May 20, 2016

  Hall 1 Hall 4 Hall 5 Hall 7

Registration & Breakfast


Opening Notes


Keynote: Olivier Grisel

10:45
  Track 1
  Frontera: open source, large scale web crawling framework (Alexander Sibiryakov)
  Removing Soft Shadows with Hard Data (Maciej Gryka)
  Data Wrangling with Python (Katharine Jarmul)
11:30
  Functional Programming in Python (Daniel Kirsch) (CANCELLED)
  The Simple Leads To The Spectacular (Angelos Kapsimanis)
  Setting up predictive analytics services with Palladium (Andreas Lattner)
12:15
  One in a billion: finding matching images in very large corpora (Ryan Henderson)
  Machine Learning at Scale (Nathan Epstein)
  Spotting trends and tailoring recommendations: PySpark on Big Data in fashion (Martina Pugliese)


14:00
  Using small data in the client instead of big data in the cloud (Anton Dubrau)
  What every data scientist should know about data anonymization (Katharina Rasch)
  Visualizing (Andrej Warkentin)
  Practical Word2vec in Gensim (Lev Konstantinovskiy)
14:45
  Dealing with TBytes of Data in Realtime (Nils Magnus)
  Accelerating Python Analytics by In-Database Processing (Edouard Fouché)
  Track 3


15:45
  Classifying Search Queries without User Click Data (Abhishek Thakur)
  Python and TouchDesigner for Interactive Experiments (Jessica Palmer)
  Plumbing in Python: Pipelines for Data Science Applications (Thomas Reineking)
  Single-source Python 2/3 (Mike Müller)
16:30
  BigchainDB: a Scalable Blockchain Database, in Python (Trent McConaghy)
  Let's play Space Invaders! (Maciej Jaskowski)
  Bayesian Optimization and its application to Neural Networks (Moritz Neeb)

Lightning Talks


Closing Notes




Keynote: Julia Evans


Main Sessions — Saturday May 21, 2016

  Hall 1 Hall 4 Hall 5 Hall 7



Keynote: Wes McKinney

10:30
  What's new in Deep Learning? (Kashif Rasul)
  Introduction to Julia for Python programmers (David Higgins)
  Python based predictive analytics with GraphLab Create (Danny Bickson)
  Using Spark -- With PySpark (Dr. Frank Gerhardt, Bence Zambo)
11:15
  Holy D@t*! How to Deal with Imperfect, Unclean Datasets (Katharine Jarmul)
  Robot uses toddler-like self exploration for the development of body representations (Guertel Idai)
  A full Machine learning pipeline in Scikit-learn vs in scala-Spark: pros and cons (Jose Quesada)
12:00
  The IceCube data pipeline from the South Pole to publication (Jakob van Santen)
  Usable A/B testing – A Bayesian approach (Nora Neumann)
  Connecting Keywords to Knowledge Base Using Search Keywords and Wikidata (Fang Xu)


13:45
  Statistically Speculating on the Source of Sneezes and Sniffles (Ian Ozsvald)
  pypet: A Python Toolkit for Simulations and Numerical Experiments (Robert Meyer)
  ExpAn - A Python library for advanced statistical analysis of A/B tests (Jie Bao)
  Which city is the cultural capital of Europe? An introduction to Apache PySpark for GeoAnalytics (Shoaib Burq, Kashif Rasul)
14:30
  Designing spaCy: A high-performance natural language processing (NLP) library written in Cython (Matthew Honnibal)
  Zero-Administration Data Pipelines using AWS Simple Workflow (Anne Matthies)
  Predicting political views from text (Felix Biessmann)
15:15
  Data Integration in the World of Microservices (Valentine Gogichashvili)
  PySpark in Practice (Ronert Obst, Dat Tran)
  Estimating stock price correlations using Wikipedia (Delia Rusu)


16:15
  Bridging the gap: from Data Science to service (Daniel Moisset)
  Visualizing research data: Challenges of combining different datasources (Juha Suomalainen)
  Brand recognition in real-life photos using deep learning (Lukasz Czarnecki)
  Track 4
17:00
  TBA (James Powell)
  Smart Banking - Real Time Driven (Christian Rebernik, Arnab Dutta)
  Building a polyglot Data Science Platform on Big Data systems (Frank Kaufer)
  Chain, Loop & Group: How Celery Empowered our Data Scientists to Take Control of our Data Pipeline (Michelle Tran)

Closing Notes