Speakers

Name Presentation(s)
Aditi Khullar Introduction to Language Modeling
Aditya Lahiri Dealing With Imbalanced Classes in Machine Learning
Agata Sumowska Role playing Annotation workshop
Akos Furton Production Code in Data Science Consulting
Alex Egg Discover your latent food graph with this 1 weird trick
Allen Downey The Inspection Paradox is Everywhere, Unconference - 10:55: Contributing to Open Source, 11:40: Data Science Education
Amanda Moran Pandas vs Koalas: The Ultimate Showdown!
Anne Bauer Data science at The New York Times: a mission-driven approach to personalizing the customer journey, Fireside Chat
Anthony Scopatz Conda-press, or Reinventing the Wheel, What's now in NumFOCUS projects? (Part 2), Stump the Chump
Bhargav Srinivasa Desikan Role playing Annotation workshop
Bill Lynch Bringing mental health data to doctors
Breck Baldwin Introduction to Bayesian Modeling with Stan: No Statistics Background Required (Pt 1), Introduction to Bayesian Modeling with Stan: No Statistics Background Required (Pt 2), Introduction to Bayesian Modeling with Stan: No Statistics Background Required (Pt 3)
Brian Austin How and why to put your Jupyter notebooks in Docker containers
Bryan Cross Panel Discussion: My First Open Source Contribution
Bryan Van de Ven What's now in NumFOCUS projects? (Part 2)
Cameron Davidson-Pilon New Trends in Estimation and Inference
Carlos Afonso Visualizing the 2019 Measles Outbreak in NYC (with Python)
Chaya D Stern Using Graph Nets (GNs) to predict molecular properties
Chris Fonnesbeck A Primer on Gaussian Processes for Regression Analysis, Panel Discussion: My First Open Source Contribution
Christian Steiner Improve the efficiency of your Big Data application
Christopher J "CJ" Wright conda-forge sprint
Chris Wiggins Data science at The New York Times: a mission-driven approach to personalizing the customer journey, Fireside Chat
Colin Carroll Building a maintainable plotting library, What's now in NumFOCUS projects? (Part 1)
Daniel Rodriguez Effective Python and R collaboration
Dante Gama Dessavre High-Performance Data Science at Scale with RAPIDS, Dask, and GPUs
David Palaitis Panel Discussion: Pitching Open Source Up Your Management Chain
Debra Williams Cauley Discussion: What’s Missing in Python and Data Online Resources?
Deepyaman Datta New and Upcoming
Diego Torres Quintanilla Cleaning, optimizing and windowing pandas with numba
Emily A Ray Discover your latent food graph with this 1 weird trick
Eric Dill Is Spark still relevant? Multi-node CPU and single-node GPU workloads with Spark, Dask and RAPIDS.
Ethan Rosenthal Time series for scikit-learn people
Eugene Tang Introduction to Language Modeling
Evan Patterson Semantic modeling of data science code
Ferras Hamad Reproducibility in ML Systems: A Netflix Original
Francesc Alted Improve the efficiency of your Big Data application, What's now in NumFOCUS projects? (Part 1)
Gil Forsyth Panel Discussion: Pitching Open Source Up Your Management Chain
Hannah Aizenman What's now in NumFOCUS projects? (Part 2), Matplotlib sprint, Panel Discussion: My First Open Source Contribution, Building a maintainable plotting library, HoloViz and Matplotlib sprint, Jupyter & matplotlib sprint
Hillary Green-Lerman How to Prove You’re Right: A/B Testing with SciPy, Hacking the Data Science Challenge
Hyonjee Joo Spark Backend for Ibis: Seamless Transition Between Pandas and Spark
Ian Whalen Implementing Lightweight Random Indexing for Polylingual Text Classification
Itamar Turner-Trauring Small Big Data: using NumPy and Pandas when your data doesn't fit in memory
Jacqueline Gutman Simplified Data Quality Monitoring of Dynamic Longitudinal Data: A Functional Programming Approach
James Powell PyData Pop Quiz, Sloth & ENVy
Jason Grout Jupyter sprint, Jupyter & matplotlib sprint
Jeff Reback Pandas sprint, Introduction to pandas
Jenny Turner-Trauring Working with Maps: Extracting Features for Traffic Crash Insights
Jens Fredrik Skogstrom What we learned by running a large custom Bayesian forecasting model in production
Jessica Tyler Geo Experiments and CausalImpact in Incrementality Testing
Jim Weiss Example
Joe Jevnik Zarr vs. HDF5
Joseph Kearney New and Upcoming
Joshua Falk Generating realistic, differentially private data sets using GANs
Julia Signell Data-centric exploration using intake, dask, hvplot, datashader, panel, and binder, HoloViz and Matplotlib sprint, Panel Discussion: My First Open Source Contribution
Kamal Abdelrahman Painting A Picture of Public Data
Katherine Kampf Build an AI-powered Pet Detector in Visual Studio Code
Keith Kraus High-Performance Data Science at Scale with RAPIDS, Dask, and GPUs
Kelle Cruz AstroPy sprint
Kelly Jin A Few Good Public Servants: How Great Analysis Inspires Action, Fireside Chat
Kelly Shen Julia for Pythonistas
Kevin Fleming Panel Discussion: Pitching Open Source Up Your Management Chain
Lara Kattan Machine learning from scratch using the scientific Python stack
Laurence Warner Role playing Annotation workshop
Lauren Oldja Managing Stakeholders: The Key to Successful Data Science for Business
Lev Konstantinovskiy Role playing Annotation workshop
Li Jin Spark Backend for Ibis: Seamless Transition Between Pandas and Spark
Maciej Wojton Free Your Esoteric Data Using Apache Arrow and Python
Malaika Handa Colorism in High Fashion (featuring: K-Means Clustering)
Marc Garcia Introduction to pandas
Marianne Hoogeveen The physics of deep learning using tensor networks
Mariel Frank Introduction to NLP
Marius van Niekerk conda-forge sprint, A How-to guide for migrating legacy data applications
Martin Hirzel Type-Driven Automated Learning with Lale
Matti Lyra [SCHEDULE CHANGE 12:45PM - 2:15PM] Neural Networks for Natural Language Processing
Meghan Heintz Launching a new warehouse with SimPy at Rent the Runway
Michael Johns Propensity Score Matching: A Non-experimental Approach to Causal Inference
Michael Skarlinski New and Upcoming
Michal Mucha Swiftly turn Jupyter notebooks into pretty web apps
Michoel Snow Hacking the Data Science Challenge, How to Prove You’re Right: A/B Testing with SciPy
Mike McCarty Panel Discussion: Pitching Open Source Up Your Management Chain
Mitzi Morris Bayesian Inference for Fun and Profit
Moussa Taifi Ph.D. Clean Machine Learning Code: Practical Software Engineering Principles for ML Craftsmanship
Natu Lauchande Machine Learning Engineering principles with Python and MLFlow
Nick Becker High-Performance Data Science at Scale with RAPIDS, Dask, and GPUs
Noam Ross Building Software and Communities With Peer Review
Parin Choganwala Discover your latent food graph with this 1 weird trick
Patrick Landreman A Crash Course in Applied Linear Algebra
Paul Ganssle Stump the Chump
Petr Wolf From Raw Recruit Scripts to Perfect Python (in 90 minutes)
Piero Ferrante Should I develop my own DS library? Maybe.
Ralf Gommers What's now in NumFOCUS projects? (Part 1)
Raman Tehlan Genetic algorithms: Making errors do all the work
Raoul-Gabriel Urma Advanced Software Testing for Data Scientists
Raphaël Meudec tf-explain: Interpretability for Tensorflow 2.0
Rohit Kapur A How-to guide for migrating legacy data applications
Romain Cledat Every ML Model Deserves To Be A Full Micro-service
Ryan Abernathey What's now in NumFOCUS projects? (Part 1)
Samuel Rochette Quantifying uncertainty in machine learning models
Sara Seager Stars, Planets, and Python, Fireside Chat
Saul Shanabrook Same API, Different Execution, Jupyter sprint, Jupyter & matplotlib sprint
Sean Law New and Upcoming
Stanley van der Merwe From Raw Recruit Scripts to Perfect Python (in 90 minutes)
Steve Dower Unconference - 10:55: Contributing to Open Source, 11:40: Data Science Education, The Secret Life of Python, Panel Discussion: My First Open Source Contribution
Tamar Yastrab The Echo-Chamber of Your Social Media Feed
Thomas Caswell Stump the Chump, Building a maintainable plotting library, HoloViz and Matplotlib sprint, Jupyter & matplotlib sprint, Matplotlib sprint
Thomas J Fan Deep Dive into scikit-learn's HistGradientBoosting Classifier and Regressor, What's now in NumFOCUS projects? (Part 2)
Tom Augspurger Scalable Machine Learning with Dask, What's now in NumFOCUS projects? (Part 1), Introduction to pandas
Veronica Hanus To comment or not
Will Kurt An Introduction to Probability and Statistics
YOU! Unconference, 1:20pm-2pm: A Github for Data, 2pm-3:25pm: OPEN, 3:40pm-5pm: Role Play annotation facilitator training, 5:10pm-5:50pm: STUMPY & Time-series analysis, Unconference - Using GPUs in Python/Julia/R Applications, Unconference - 2:10: Software Engineering for Data Scientists, 2:55: Equity and Algorithmic Fairness, Unconference - Native Extension Modules, Unconference - 3:45: Design Thinking for Data Science, 4:30: Open
Yuanqing Wang Using Graph Nets (GNs) to predict molecular properties