Conference Schedule

General Sessions — Monday Nov. 4, 2019

Central Park West (6501) | Central Park East (6501a) | Winter Garden (5412) | Belasco (6203) | Broadway (5202)
8:00 AM

Breakfast & Registration

9:00 AM

Opening Remarks @ Central Park

9:15 AM A Few Good Public Servants: How Great Analysis Inspires Action Kelly Jin
10:05 AM | Time series for scikit-learn people Ethan Rosenthal | Managing Stakeholders: The Key to Successful Data Science for Business Lauren Oldja | Bringing mental health data to doctors Bill Lynch | Unconference YOU! | AstroPy sprint Kelle Cruz
10:45 AM

Coffee Break

10:55 AM | Pandas vs Koalas: The Ultimate Showdown! Amanda Moran | A Crash Course in Applied Linear Algebra Patrick Landreman | Working with Maps: Extracting Features for Traffic Crash Insights Jenny Turner-Trauring | Unconference YOU! | AstroPy sprint Kelle Cruz
11:40 AM | Scalable Machine Learning with Dask Tom Augspurger | Small Big Data: using NumPy and Pandas when your data doesn't fit in memory Itamar Turner-Trauring | To comment or not Veronica Hanus
12:20 PM

Lunch - Sponsored by FactSet

1:20 PM | Launching a new warehouse with SimPy at Rent the Runway Meghan Heintz | Cleaning, optimizing and windowing pandas with numba Diego Torres Quintanilla | Conda-press, or Reinventing the Wheel Anthony Scopatz | 1:20pm-2pm: A Github for Data, 2pm-3:25pm: OPEN YOU! | Discussion: What’s Missing in Python and Data Online Resources? Debra Williams Cauley
2:05 PM | The physics of deep learning using tensor networks Marianne Hoogeveen | Propensity Score Matching: A Non-experimental Approach to Causal Inference Michael Johns | Effective Python and R collaboration Daniel Rodriguez | New and Upcoming Sean Law, Deepyaman Datta, Joseph Kearney
2:50 PM | Is Spark still relevant? Multi-node CPU and single-node GPU workloads with Spark, Dask and RAPIDS. Eric Dill | Same API, Different Execution Saul Shanabrook | Using Graph Nets (GNs) to predict molecular properties Chaya D Stern, Yuanqing Wang | What's now in NumFOCUS projects? (Part 1) Ralf Gommers, Colin Carroll, Francesc Alted, Ryan Abernathey
3:30 PM


3:40 PM | Quantifying uncertainty in machine learning models Samuel Rochette | Generating realistic, differentially private data sets using GANs Joshua Falk | Simplified Data Quality Monitoring of Dynamic Longitudinal Data: A Functional Programming Approach Jacqueline Gutman | 3:40pm-5pm: Role Play annotation facilitator training, 5:10pm-5:50pm: STUMPY & Time-series analysis YOU! | Panel Discussion: My First Open Source Contribution Bryan Cross, Chris Fonnesbeck, Julia Signell, Steve Dower
4:25 PM | Spark Backend for Ibis: Seamless Transition Between Pandas and Spark Li Jin, Hyonjee Joo | Build an AI-powered Pet Detector in Visual Studio Code Katherine Kampf | Geo Experiments and CausalImpact in Incrementality Testing Jessica Tyler | Panel Discussion: Pitching Open Source Up Your Management Chain Gil Forsyth, David Palaitis, Kevin Fleming, Mike McCarty
5:10 PM | What we learned by running a large custom Bayesian forecasting model in production Jens Fredrik Skogstrom | Semantic modeling of data science code Evan Patterson | Zarr vs. HDF5 Joe Jevnik | What's now in NumFOCUS projects? (Part 2) Hannah Aizenman, Anthony Scopatz, Bryan Van de Ven, Thomas J Fan
5:50 PM
6:30 PM

Blue Fire AI presents: Social Reception @ Datadog

8:30 PM

General Sessions — Tuesday Nov. 5, 2019

Central Park West (6501) | Central Park East (6501a) | Winter Garden (5412) | Music Box (5411) | Ambassador (6202) | Belasco (6203) | Broadway (5202)
8:00 AM

Breakfast & Registration

9:15 AM Data science at The New York Times: a mission-driven approach to personalizing the customer journey Chris Wiggins, Anne Bauer
10:05 AM | Deep Dive into scikit-learn's HistGradientBoosting Classifier and Regressor Thomas J Fan | Julia for Pythonistas Kelly Shen | Improve the efficiency of your Big Data application Francesc Alted, Christian Steiner | The Echo-Chamber of Your Social Media Feed Tamar Yastrab | Unconference - Using GPUs in Python/Julia/R Applications YOU! | Fireside Chat Chris Wiggins | conda-forge sprint Marius van Niekerk
10:45 AM

Coffee Break

10:55 AM | Every ML Model Deserves To Be A Full Micro-service Romain Cledat | Dealing With Imbalanced Classes in Machine Learning Aditya Lahiri | Production Code in Data Science Consulting Akos Furton | High-Performance Data Science at Scale with RAPIDS, Dask, and GPUs Keith Kraus | Unconference - 10:55: Contributing to Open Source, 11:40: Data Science Education Steve Dower, Allen Downey | Introduction to Bayesian Modeling with Stan: No Statistics Background Required (Pt 1) Breck Baldwin | conda-forge sprint Marius van Niekerk, Christopher J "CJ" Wright
11:40 AM | Clean Machine Learning Code: Practical Software Engineering Principles for ML Craftsmanship Moussa Taifi Ph.D. | Discover your latent food graph with this 1 weird trick Alex Egg, Emily A Ray, Parin Choganwala | Colorism in High Fashion (featuring: K-Means Clustering) Malaika Handa
12:20 PM

Lunch - Sponsored by FactSet

1:00 PM

Closing remarks @ Central Park West

1:20 PM Stars, Planets, and Python Sara Seager
2:10 PM | tf-explain: Interpretability for Tensorflow 2.0 Raphaël Meudec | Type-Driven Automated Learning with Lale Martin Hirzel | Data-centric exploration using intake, dask, hvplot, datashader, panel, and binder Julia Signell | Genetic algorithms: Making errors do all the work Raman Tehlan | Fireside Chat Sara Seager | Introduction to Bayesian Modeling with Stan: No Statistics Background Required (Pt 2) Breck Baldwin | Unconference - 2:10: Software Engineering for Data Scientists, 2:55: Equity and Algorithmic Fairness YOU!
2:55 PM | Building a maintainable plotting library Colin Carroll, Hannah Aizenman, Thomas Caswell | Implementing Lightweight Random Indexing for Polylingual Text Classification Ian Whalen | Should I develop my own DS library? Maybe. Piero Ferrante | How and why to put your Jupyter notebooks in Docker containers Brian Austin | Unconference - Native Extension Modules YOU!
3:35 PM


3:45 PM | The Secret Life of Python Steve Dower | Free Your Esoteric Data Using Apache Arrow and Python Maciej Wojton | The Inspection Paradox is Everywhere Allen Downey | Painting A Picture of Public Data Kamal Abdelrahman | Unconference - 3:45: Design Thinking for Data Science, 4:30: Open YOU! | Introduction to Bayesian Modeling with Stan: No Statistics Background Required (Pt 3) Breck Baldwin | PyData Pop Quiz James Powell
4:30 PM | Reproducibility in ML Systems: A Netflix Original Ferras Hamad | Sloth & ENVy James Powell | A How-to guide for migrating legacy data applications Marius van Niekerk, Rohit Kapur | Building Software and Communities With Peer Review Noam Ross | Stump the Chump Thomas Caswell, Anthony Scopatz, Paul Ganssle
5:15 PM

Lightning Talks @ Music Box

6:00 PM
7:00 PM

Invitation only: Speaker Social Reception sponsored by Two Sigma

9:00 PM

Tutorial Sessions — Wednesday Nov. 6, 2019

Winter Garden (5412) | Music Box (5411) | Radio City (6604) | Broadway (5202) | Ambassador (6202) | Belasco (6203)
Curated Track: Statistics

Curated Track: NLP

8:00 AM

Breakfast & Registration

9:00 AM | Introduction to pandas Marc Garcia, Jeff Reback, Tom Augspurger | An Introduction to Probability and Statistics Will Kurt | Introduction to NLP Mariel Frank | Visualizing the 2019 Measles Outbreak in NYC (with Python) Carlos Afonso | HoloViz and Matplotlib sprint Julia Signell | Jupyter sprint Saul Shanabrook
10:30 AM

Coffee Break

10:45 AM | Advanced Software Testing for Data Scientists Raoul-Gabriel Urma | How to Prove You’re Right: A/B Testing with SciPy Hillary Green-Lerman, Michoel Snow | Introduction to Language Modeling Aditi Khullar, Eugene Tang | Swiftly turn Jupyter notebooks into pretty web apps Michal Mucha | HoloViz and Matplotlib sprint Julia Signell | Jupyter sprint Saul Shanabrook, Jason Grout
12:15 PM


1:15 PM | Machine learning from scratch using the scientific Python stack Lara Kattan | New Trends in Estimation and Inference Cameron Davidson-Pilon | [SCHEDULE CHANGE 12:45PM - 2:15PM] Neural Networks for Natural Language Processing Matti Lyra | From Raw Recruit Scripts to Perfect Python (in 90 minutes) Stanley van der Merwe, Petr Wolf | Pandas sprint Jeff Reback | Jupyter & matplotlib sprint Saul Shanabrook
2:45 PM


3:00 PM | Hacking the Data Science Challenge Michoel Snow, Hillary Green-Lerman | Bayesian Inference for Fun and Profit Mitzi Morris | Role playing Annotation workshop Agata Sumowska, Bhargav Srinivasa Desikan, Laurence Warner, Lev Konstantinovskiy | A Primer on Gaussian Processes for Regression Analysis Chris Fonnesbeck | Pandas sprint Jeff Reback | Jupyter & matplotlib sprint Saul Shanabrook
4:30 PM

Wrap Up

6:00 PM

Community Mixer with NYC Python @ Broadway

8:30 PM
