Speakers

Name	Presentation(s)
Aditi Khullar	Introduction to Language Modeling
Aditya Lahiri	Dealing With Imbalanced Classes in Machine Learning
Agata Sumowska	Role playing Annotation workshop
Akos Furton	Production Code in Data Science Consulting
Alex Egg	Discover your latent food graph with this 1 weird trick
Allen Downey	The Inspection Paradox is Everywhere, Unconference - 10:55: Contributing to Open Source, 11:40: Data Science Education
Amanda Moran	Pandas vs Koalas: The Ultimate Showdown!
Anne Bauer	Data science at The New York Times: a mission-driven approach to personalizing the customer journey, Fireside Chat
Anthony Scopatz	Conda-press, or Reinventing the Wheel, What's now in NumFOCUS projects? (Part 2), Stump the Chump
Bhargav Srinivasa Desikan	Role playing Annotation workshop
Bill Lynch	Bringing mental health data to doctors
Breck Baldwin	Introduction to Bayesian Modeling with Stan: No Statistics Background Required (Pt 1), Introduction to Bayesian Modeling with Stan: No Statistics Background Required (Pt 2), Introduction to Bayesian Modeling with Stan: No Statistics Background Required (Pt 3)
Brian Austin	How and why to put your Jupyter notebooks in Docker containers
Bryan Cross	Panel Discussion: My First Open Source Contribution
Bryan Van de Ven	What's now in NumFOCUS projects? (Part 2)
Cameron Davidson-Pilon	New Trends in Estimation and Inference
Carlos Afonso	Visualizing the 2019 Measles Outbreak in NYC (with Python)
Chaya D Stern	Using Graph Nets (GNs) to predict molecular properties
Chris Fonnesbeck	A Primer on Gaussian Processes for Regression Analysis, Panel Discussion: My First Open Source Contribution
Christian Steiner	Improve the efficiency of your Big Data application
Christopher J "CJ" Wright	conda-forge sprint
Chris Wiggins	Data science at The New York Times: a mission-driven approach to personalizing the customer journey, Fireside Chat
Colin Carroll	Building a maintainable plotting library, What's now in NumFOCUS projects? (Part 1)
Daniel Rodriguez	Effective Python and R collaboration
Dante Gama Dessavre	High-Performance Data Science at Scale with RAPIDS, Dask, and GPUs
David Palaitis	Panel Discussion: Pitching Open Source Up Your Management Chain
Debra Williams Cauley	Discussion: What’s Missing in Python and Data Online Resources?
Deepyaman Datta	New and Upcoming
Diego Torres Quintanilla	Cleaning, optimizing and windowing pandas with numba
Emily A Ray	Discover your latent food graph with this 1 weird trick
Eric Dill	Is Spark still relevant? Multi-node CPU and single-node GPU workloads with Spark, Dask and RAPIDS.
Ethan Rosenthal	Time series for scikit-learn people
Eugene Tang	Introduction to Language Modeling
Evan Patterson	Semantic modeling of data science code
Ferras Hamad	Reproducibility in ML Systems: A Netflix Original
Francesc Alted	Improve the efficiency of your Big Data application, What's now in NumFOCUS projects? (Part 1)
Gil Forsyth	Panel Discussion: Pitching Open Source Up Your Management Chain
Hannah Aizenman	What's now in NumFOCUS projects? (Part 2), Matplotlib sprint, Panel Discussion: My First Open Source Contribution, Building a maintainable plotting library, HoloViz and Matplotlib sprint, Jupyter & matplotlib sprint
Hillary Green-Lerman	How to Prove You’re Right: A/B Testing with SciPy, Hacking the Data Science Challenge
Hyonjee Joo	Spark Backend for Ibis: Seamless Transition Between Pandas and Spark
Ian Whalen	Implementing Lightweight Random Indexing for Polylingual Text Classification
Itamar Turner-Trauring	Small Big Data: using NumPy and Pandas when your data doesn't fit in memory
Jacqueline Gutman	Simplified Data Quality Monitoring of Dynamic Longitudinal Data: A Functional Programming Approach
James Powell	PyData Pop Quiz, Sloth & ENVy
Jason Grout	Jupyter sprint, Jupyter & matplotlib sprint
Jeff Reback	Pandas sprint, Introduction to pandas
Jenny Turner-Trauring	Working with Maps: Extracting Features for Traffic Crash Insights
Jens Fredrik Skogstrom	What we learned by running a large custom Bayesian forecasting model in production
Jessica Tyler	Geo Experiments and CausalImpact in Incrementality Testing
Jim Weiss	Example
Joe Jevnik	Zarr vs. HDF5
Joseph Kearney	New and Upcoming
Joshua Falk	Generating realistic, differentially private data sets using GANs
Julia Signell	Data-centric exploration using intake, dask, hvplot, datashader, panel, and binder, HoloViz and Matplotlib sprint, Panel Discussion: My First Open Source Contribution
Kamal Abdelrahman	Painting A Picture of Public Data
Katherine Kampf	Build an AI-powered Pet Detector in Visual Studio Code
Keith Kraus	High-Performance Data Science at Scale with RAPIDS, Dask, and GPUs
Kelle Cruz	AstroPy sprint
Kelly Jin	A Few Good Public Servants: How Great Analysis Inspires Action, Fireside Chat
Kelly Shen	Julia for Pythonistas
Kevin Fleming	Panel Discussion: Pitching Open Source Up Your Management Chain
Lara Kattan	Machine learning from scratch using the scientific Python stack
Laurence Warner	Role playing Annotation workshop
Lauren Oldja	Managing Stakeholders: The Key to Successful Data Science for Business
Lev Konstantinovskiy	Role playing Annotation workshop
Li Jin	Spark Backend for Ibis: Seamless Transition Between Pandas and Spark
Maciej Wojton	Free Your Esoteric Data Using Apache Arrow and Python
Malaika Handa	Colorism in High Fashion (featuring: K-Means Clustering)
Marc Garcia	Introduction to pandas
Marianne Hoogeveen	The physics of deep learning using tensor networks
Mariel Frank	Introduction to NLP
Marius van Niekerk	conda-forge sprint, A How-to guide for migrating legacy data applications
Martin Hirzel	Type-Driven Automated Learning with Lale
Matti Lyra	[SCHEDULE CHANGE 12:45PM - 2:15PM] Neural Networks for Natural Language Processing
Meghan Heintz	Launching a new warehouse with SimPy at Rent the Runway
Michael Johns	Propensity Score Matching: A Non-experimental Approach to Causal Inference
Michael Skarlinski	New and Upcoming
Michal Mucha	Swiftly turn Jupyter notebooks into pretty web apps
Michoel Snow	Hacking the Data Science Challenge, How to Prove You’re Right: A/B Testing with SciPy
Mike McCarty	Panel Discussion: Pitching Open Source Up Your Management Chain
Mitzi Morris	Bayesian Inference for Fun and Profit
Moussa Taifi Ph.D.	Clean Machine Learning Code: Practical Software Engineering Principles for ML Craftsmanship
Natu Lauchande	Machine Learning Engineering principles with Python and MLFlow
Nick Becker	High-Performance Data Science at Scale with RAPIDS, Dask, and GPUs
Noam Ross	Building Software and Communities With Peer Review
Parin Choganwala	Discover your latent food graph with this 1 weird trick
Patrick Landreman	A Crash Course in Applied Linear Algebra
Paul Ganssle	Stump the Chump
Petr Wolf	From Raw Recruit Scripts to Perfect Python (in 90 minutes)
Piero Ferrante	Should I develop my own DS library? Maybe.
Ralf Gommers	What's now in NumFOCUS projects? (Part 1)
Raman Tehlan	Genetic algorithms: Making errors do all the work
Raoul-Gabriel Urma	Advanced Software Testing for Data Scientists
Raphaël Meudec	tf-explain: Interpretability for Tensorflow 2.0
Rohit Kapur	A How-to guide for migrating legacy data applications
Romain Cledat	Every ML Model Deserves To Be A Full Micro-service
Ryan Abernathey	What's now in NumFOCUS projects? (Part 1)
Samuel Rochette	Quantifying uncertainty in machine learning models
Sara Seager	Stars, Planets, and Python, Fireside Chat
Saul Shanabrook	Same API, Different Execution, Jupyter sprint, Jupyter & matplotlib sprint
Sean Law	New and Upcoming
Stanley van der Merwe	From Raw Recruit Scripts to Perfect Python (in 90 minutes)
Steve Dower	Unconference - 10:55: Contributing to Open Source, 11:40: Data Science Education, The Secret Life of Python, Panel Discussion: My First Open Source Contribution
Tamar Yastrab	The Echo-Chamber of Your Social Media Feed
Thomas Caswell	Stump the Chump, Building a maintainable plotting library, HoloViz and Matplotlib sprint, Jupyter & matplotlib sprint, Matplotlib sprint
Thomas J Fan	Deep Dive into scikit-learn's HistGradientBoosting Classifier and Regressor, What's now in NumFOCUS projects? (Part 2)
Tom Augspurger	Scalable Machine Learning with Dask, What's now in NumFOCUS projects? (Part 1), Introduction to pandas
Veronica Hanus	To comment or not
Will Kurt	An Introduction to Probability and Statistics
YOU!	Unconference, Unconference, 1:20pm-2pm: A Github for Data, 2pm-3:25pm: OPEN, 3:40pm-5pm: Role Play annotation facilitator training, 5:10pm-5:50pm: STUMPY & Time-series analysis, Unconference - Using GPUs in Python/Julia/R Applications, Unconference - 2:10: Software Engineering for Data Scientists, 2:55: Equity and Algorithmic Fairness, Unconference - Native Extension Modules, Unconference - 3:45: Design Thinking for Data Science, 4:30: Open
Yuanqing Wang	Using Graph Nets (GNs) to predict molecular properties