Speakers
Name
Presentation(s)
Abhilash Babu
Using scikit-learn models in a C/C++ application
,
Lightning Talks
Achilleas Koutsou
gin@home: run your own research data management platform
,
Lightning Talks
Adam Chang
Highly-Scalable NLP to Answer Questions on South Africa’s COVID-19 WhatsApp Hotline
Adam Webber
Building a Data-Driven Product from Scratch, How Hard Can It Be?
Adam Zadrożny
Law, Graphs & Python
Adrien Treuille
Python Dashboarding Shootout and Showdown
Adway Dhillon
Serving BERT Models in Production with TorchServe
Aishwarya Agrawal
Interpretable ML models at scale
Alankrita Tewari
Anyone GAN do this: Solving the Minority Class Imbalance problem once and for all
Alejandro Saucedo
Accelerating ML Inference at Scale with ONNX, Triton and Seldon
Aleksander Molak
Modeling aleatoric and epistemic uncertainty using Tensorflow and Tensorflow Probability
Alex Wu
Unifying Large Scale Data Preprocessing and Machine Learning Pipelines with Ray Datasets
Allen Downey
Computational Survival Analysis
Alon Nir
Sliding into Causal Inference, with Python!
,
Snack-size Awesome Lists
,
Lightning Talks
Amy Wooding
Love your (data scientist) neighbour: Reproducible data science the Easydata way
Anders Berkeman
Computations as Assets - a New Approach to Reproducibility and Transparency
Andrew Garrow
A Platform to Enable Data Science At Scale in Tesco
Andrew Shao
Introduction to Unsupervised and Semi-Supervised Learning in TensorFlow
Andrey Cheptsov
From Jupyter Notebooks To JetBrains DataSpell
Ankit Rathi
Best Practices in Machine Learning Observability
Antoni Baum
Cutting edge hyperparameter tuning made simple with Ray Tune
April Rathe
Dask: From POC to Production
Ardo Illaste
It's not just about survival: using survival analysis to study customer behaviour
,
Lightning Talks
Aseel Addawood
Reasoning with Natural Language Processing: advancement in the interpretation of Arabic speech
Ashraf Ibrahim
Redefining Insurance with Predictive and Preventive Artificial Intelligence.
Austin Walters
What's in your data: Data Profiler - An Open Source Solution to Explain Your Data
Avi Aminov
Classifying Documents on a Graph using GNNs
Banjo Obayomi
Lightning Talks
,
Building an Open Source Topic Modeling Library
Barry Fitzgerald
Risk at Scale - Running a large investment risk system and how risk analysis techniques can help you
Benjamin Ajayi-Obe
An analysis of Societal Bias in SOTA NLP Transfer Learning
Benjamin Lehne
A Platform to Enable Data Science At Scale in Tesco
Benjamin Zaitlen
Data Processing at Scale
Ben Marwick
Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Brad Boehmke
Foundational Infrastructure to Create a Successful Data Science Team
Braden Riggs
Unlocking more from your Audio Data
,
Lightning Talks
Brendan Collins
Spatial Analytics using Dask & Numba
,
Lightning Talks
Brian Cechmanek
Pointer Generator Summarisation and Explainability for Legal Documents
,
Lightning Talks
Bruno Gonçalves
Graphs for Data Science with NetworkX (pre-recorded)
Carl Drougge
Computations as Assets - a New Approach to Reproducibility and Transparency
Chase Ginther
Enterprise Machine Learning Pipelines with Unstructured Image Data
Chengxuan Wang
FugueSQL - The Enhanced SQL Interface for Pandas, Spark, and Dask DataFrames
Cheuk Ting Ho
Knowledge graph data modelling with TerminusDB
,
Turning Pandas DataFrames to Semantic Knowledge Graph
Chin Hwee Ong
Designing Functional Data Pipelines for Reproducibility and Maintainability
Chris Ostrouchov
Serving and Managing Reproducible Conda Environments via Conda-Store
Christian Juncker Brædstrup
Stau - lightweight job orchestration for data science workloads
,
Lightning Talks
Christopher Ariza
Why Datetimes Need Units: Avoiding a Y2262 Problem & Harnessing the Power of NumPy's datetime64
Christopher Lozinski
An Introduction to the Current Information War
Clair J. Sullivan
Working with Data in a Connected World: the Power of Graph Data Science
Clark Zinzow
Unifying Large Scale Data Preprocessing and Machine Learning Pipelines with Ray Datasets
Cor Zuurmond
Wisdom of the Crowd: amplifying human intelligence with AI
Dana Averbuch
Take a Deep Berth and Let’s Dive Into the Matching Algorithm for Marina’s
,
Lightning Talks
Daniel Townsend
Managing your data using FastAPI and Piccolo Admin
Danny Chiao
Feature Stores: An operational bridge between machine learning models and data
David Hopes
An analysis of Societal Bias in SOTA NLP Transfer Learning
Dean Pleban
🦉DVC Showcase – Who Moved My Data? 🗂
Diego Arenas
Automating the Exploration of Databases for Data Science with AEDA
,
Lightning Talks
Dingqian (Sara) Liu
What do C-Suite Executives Pay Attention To?
Dipam Paul
Anyone GAN do this: Solving the Minority Class Imbalance problem once and for all
Doris Jung-Lin Lee
Lux: Automatic Visualizations for Exploratory Data Science
Dor Kedem
Introduction to Distance Metric Learning
Dr. Brandeis Marshall (she/her)
Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Dr. Shrirang Ambaji Kulkarni
Impact of Noisy Data on Support Vector Machine and Deep Learning Algorithms on Edge Computing Device
Eduardo Blancas
Effective Testing for Machine Learning Projects
Ehsan Totoni
Bodo: Supercomputing-Like Performance and Scale for Python/Pandas
Eli Sander
Think Like Git
Emeli Dral
Is my data drifting? Early monitoring for machine learning models in production.
Eric Dill
conda-forge in 2021
Eric Ma
An attempt at demystifying graph deep learning
Ethan Swan
Foundational Infrastructure to Create a Successful Data Science Team
Evgeny Karev
Livemark: markdown for data journalism and documentation writing
,
Lightning Talks
Eyal Kazin
Start Asking Your Data “Why?” - A Gentle Introduction To Causal Inference
Farooq Shaikh
Building a Meta Forecasting Model with Prophet and LSTM for Time series Forecasting
,
Lightning Talks
Filip Jankovic
From Jupyter to Production: Deploying an Influenza Monitoring System at Scale with Wearable Sensors
Francesc Alted
Introducing Blosc2, the next generation of the Blosc compressor
Francesco Lässig
Darts for Time Series Forecasting
Francesco Tisiot
Get to know Apache Kafka with Jupyter Notebooks
,
Kickstart your Apache Kafka with Faker Data
,
Lightning Talks
Frits Hermans
DedupliPy: a new deduplication package
Gaby Lio
Data Science in the Enterprise: A Holistic Approach
Gatha
Visualizations for Privacy Preservation: The Balancing Act between Utility and Uncertainty
Gus Cavanaugh
High Performance Python With Numba, Dask, and Rapids For the Absolute Beginner
Gus Powers
Foundational Infrastructure to Create a Successful Data Science Team
Guzal Bulatova
Sktime - a Unified Toolbox for Machine Learning with Time Series Data
,
Lightning Talks
Han Wang
Large Scale Data Validation with Fugue
hassaku
Improving accessibility for data science with graph sonification library
,
Lightning Talks
Haw-minn Lu
Map Visualizations with Dash Leaflet
Hoda Rezaei
Innovating in the Oil & Gas Industry with AI/ML
Hugo Bowne-Anderson
Dask for Everyone
Hussain Sultan
Packaging PyData for Enterprise Software Supply Chain (pre-recording)
Huy Ngo
Towards Collaborative Reproducibility: Pinning Repository of Binary Distributions
Ilana Tuil
Productizing Wav2Vec 2.0: challenges and considerations
,
Lightning Talks
Iman Mossavat
Compressive Sensing
Imen Ayari
Software inspired workflow for Data Analysis
Irina Vidal Migallón
The prototype hole and tools to help you out of it
Ivana Feldfeber
Analyzing gender based violence data with Python
Jacob Schreiber
Submodular optimization for minimizing redundancy in massive data sets
Jacob Tomlinson
GPU development with Python 101
Jacob Zelko
A Visual Odyssey: Animations and Visualizations Made with Julia
,
Lightning Talks
Jacqueline Nolis
Enterprise Machine Learning Pipelines with Unstructured Image Data
James A. Bednar
Python Dashboarding Shootout and Showdown
James Bourbeau
Data Processing at Scale
,
Snowflake & Dask: How to scale workloads using distributed fetch capabilities
James Laidler
Iguanas - A new rule generation and optimisation package
,
Lightning Talks
James Powell
So you wanna be a Pandas expert? | (Pre-recorded Tutorial)
,
So you wanna be a Pandas expert? | (Live Q&A)
Jason Lee
Improving Topic Model Interpretability Through Aggregation
,
Lightning Talks
Jeremy Goodsitt
What's in your data: Data Profiler - An Open Source Solution to Explain Your Data
Jeremy John Selva
Tips and advice when creating a python software for lab members to use in academia
,
Lightning Talks
Jeremy Tuloup
JupyterLite: Jupyter ❤️ WebAssembly ❤️ Python
Jesper Dramsch
How to Guarantee No One Understands What You Did in Your Machine Learning Project
Jiri Navratil
Uncertainty Quantification 360: A Hands-on Tutorial
John Cox
Using Generative Adversarial Networks (GANs) to Produce Elliptic Curves of Specified Rank
,
Lightning Talks
John McCambridge
Lightweight and Automated Exploratory Data Analysis
,
Lightning Talks
John Sandall
Agile Data Science: How To Implement Agile Workflows For Analytics & Machine Learning
Jonathan Bechtel
Behind the Black Box: How to Understand Any ML Model Using SHAP
Juan Luis Cano Rodríguez
Document your scientific project with Markdown, Sphinx, and Read the Docs
Juan Orduz
Exploring Tools for Interpretable Machine Learning
Jules S. Damji
Feature Stores: An operational bridge between machine learning models and data
Julia Kodysh
Data infrastructure at the COVID Tracking Project
Julien Herzen
Darts for Time Series Forecasting
Jun Liu
Fugue Tune: A Simple Interface for Distributed Hyperparameter Optimization
Kalyan Munjuluri
Machine Learning Lifecycle Made Easy with MLflow
Kalyan Prasad
Neural Prophet – A powerful AI framework for Time Series Models
Karishma Babbar
Machine Learning Lifecycle Made Easy with MLflow
Kevin Anderson, Mark Mikofski, Abhishek Parikh, Silvana Ovaitt
Data and tools to model PV Systems
Kevin Duisters
Wisdom of the Crowd: amplifying human intelligence with AI
Kevin Jahns
Collaborative editing in Jupyter Notebook
Kevin Kho
An Intro to Workflow Management with Prefect
,
Simplifying Testing of Spark Applications
Kevin Stumpf
Snowflake and Tecton: How to build production-ready machine learning pipelines
Kirstie Whitaker
Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Kjell Wooding
Makefiles: One great trick for making your conda environments more manageable.
Laura Jehl
Analyzing Company Filings for Stock Selection – a Practical Report
Layne Sadler
AIQC; deep learning experiment tracking with multi-dimensional pre/post-processing.
Liron Faybish (Ben-Kimon)
Keeping sensitive data safe using recommendation systems
Logan Kilpatrick
The importance of non-code open source contributions
,
Lightning Talks
Lucy Jiménez
Analyzing gender based violence data with Python
Madhur Tandon
JupyterLite: Jupyter ❤️ WebAssembly ❤️ Python
Magdalena Wiercioch
Deep learning-aided drug discovery
Malte Tichy
What could possibly go wrong when evaluating forecasts?
Manojit Nandi
Assessing and Mitigating Unfairness in AI Systems
Marcin Mosiolek
Deep Neural Deduplication
Marco Edward Gorelli
Let's Implement Bayesian Ordered Logistic Regression!
,
Run any Python code quality tool on a Jupyter Notebook!
,
Lightning Talks
Marek Suppa
Adapters: A neat (and production-enabling) trick for multi-task and multi-lingual NLP
,
Lightning Talks
Mark Keller
Snowflake & Dask: How to scale workloads using distributed fetch capabilities
Markus Löning
sktime - A Unified Toolbox for Machine Learning with Time Series
Martin Durant
All you need is zarr.: Parallel access to remote HDF5, TIFF, grib2 and others.
,
Data Processing at Scale
Martin Renou
JupyterLite: Jupyter ❤️ WebAssembly ❤️ Python
Martin Wanjiru
A first step from ad-hoc SQL to scalable ETL
Matthew Powers
5 Reasons Parquet files are better than CSV for data analyses
,
Data Processing at Scale
Max Ghenis
Fusing economic survey datasets with the synthimpute Python package
Meenal Jhajharia
Football Analytics Using Hierarchical Bayesian Models in PyMC
Megan Yow
Simplifying Testing of Spark Applications
Meirav Ben Izhak
Storytelling With Data – How To Turn a Basic Dataset Into a Compelling Story
Meredydd Luff
From Jupyter Notebook to Production Web App, with Anvil and (only) Python
Meshva Patel
Data Analytics in Healthcare
,
Lightning Talks
Michael Sonntag
gin@home: run your own research data management platform
,
Lightning Talks
Mika Pflüger
gin@home: run your own research data management platform
,
Lightning Talks
Mike McCarty
Legate Pandas: Scaling the Python ecosystem
Miki Tebeka
Lightning Talks
,
Using IPython as Presentation Tool
Milecia McGregor
Using Reproducible Experiments To Create Better Machine Learning Models
Miles Adkins
Snowflake and Tecton: How to build production-ready machine learning pipelines
,
Snowflake & Dask: How to scale workloads using distributed fetch capabilities
Mitali Sanwal
Getting started with Dask using Saturn Cloud
MUSASIZI FRANCIS KAMANZI
Image(face) Classification with Computer Vision and Python
Nabanita Roy
Getting Started with Text Classification: Predict if Tweets are about Real Disasters
Neel Surya
An online pedagogical tool for data science using Bokeh
,
Lightning Talks
Nguyễn Gia Phong
Towards Collaborative Reproducibility: Pinning Repository of Binary Distributions
Nicolas Kruchten
Why *Interactive* Data Visualization Matters for Data Science in Python
,
Python Dashboarding Shootout and Showdown
Nicolo Musmeci
Modelling and data visualisation for commodity trading
,
Lightning Talks
Nidhin Pattaniyil
Deploying a Mobile App on Tensorflow: Lessons Learned
,
Serving BERT Models in Production with TorchServe
Niels Bantilan
Robust, End-to-end Online Machine Learning Applications with Flyte, Pandera and Streamlit
Nik Agarwal
Best of Both Worlds: R & Python
,
Lightning Talks
Nimrita Koul
Data Analysis with Pandas and NumPy
Nir Barazida
📚 Notebook To Production 👷🏼
Olga Bane
Predictive modeling in a video advertising marketplace
Oliver Gindele
ML in Production – Serverless and Painless
Oliver Rieger
Machine learning in health: Predicting pregnancy complications
Omri Fima
Components, Workflows, and Cookbooks - Building Medical Grade AI pipelines with Argo Workflows
Oyidiya Oji
Tech for Posterity. Challenges for the Future of AI Ethics and DEI
Paco Nathan
Graph Thinking
Paul Klinger
Image classification in retail: Lessons from the real world
Philipp Rudiger
Build polished, data-driven applications directly from your Pandas or XArray pipelines
,
Python Dashboarding Shootout and Showdown
Pranav Kompally
Project Nirvana : A Podcast Summariser
,
Lightning Talks
Prasanna Sattigeri
Uncertainty Quantification 360: A Hands-on Tutorial
Quan Nguyen
Making the Perfect Cup of Joe: Active Preference Learning and Optimization Under Uncertainty
Rachel Shalom
Deep Learning for Tabular Data
Raghuram Thiagarajan
An online pedagogical tool for data science using Bokeh
,
Lightning Talks
Ramiro Caro
What to do when you can't trust your labels? A practical approach
Reshama Shaikh
Deploying a Mobile App on Tensorflow: Lessons Learned
Richard Zamora
Data Processing at Scale
Ritchie Vink
Polars, the fastest DataFrame library you never heard of.
Robert Meyer
Lessons learned from deploying Machine Learning in an old-fashioned heavy industry
Rongpeng Li (Ron)
Can the “Best” Language Model Detect Logical Fallacies?
,
Lightning Talks
Ross Hart
Building linear programs with ORTools
Ruben Mak
Some Attention for Attenuation Bias
Ryan Soklaski
hydra-zen: Configurable, Reproducible, and Scalable Computing with Hydra
,
MyGrad: Drop-in automatic differentiation for NumPy
,
Lightning Talks
Sandhya Prabhakaran
Sparcle: assigning transcripts to cells in multiplexed images
Sanket Verma
PyData Meetup Organisers Social
Sarah Krasnik
Dev, Staging, and Production in Data Engineering with Terraform
Sarah Schuhegger
Automatic body part recognition for medical images with python
,
Lightning Talks
Sara Stoudt
Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Sara Tähtinen
Computer vision and xAI: explaining a single prediction with visualisations and examples
Sebastian Lautz
Extracting complements and substitutes from sales data - a network perspective
Sebastian M. Ernst
Amazing things your (Unix) operating system can do for you: POSIX shared memory
,
Lightning Talks
Sergii Mikhtoniuk
Time: The most misunderstood dimension in data modelling
Seth Shelnutt
Extending Jupyter Data Visualizations Beyond the Notebook
Shagun Sodhani
Profiling and Tuning PyTorch Models
Shashank Shekhar
Counter Factual Analysis for Explainable AI
Shivay Lamba
Tensorflow For the Web : Converting Python Machine Learning Models to Javascript using TFJS Converte
,
Lightning Talks
Simona Maggio
Lightning Talks
,
Stress Test Center: moving the stress from the user to the model
Sin-seok SEO
Know Your Data First: An Introduction to Exploratory Data Analysis
Sivan Biham
Wounds Over Time - Tracking Wound Healing via 3D Models
Smriti Singh
NLP and Hate speech: Why does it matter and what can we do?
Sofia Hörberg
Computations as Assets - a New Approach to Reproducibility and Transparency
Soumya Ghosh
Uncertainty Quantification 360: A Hands-on Tutorial
Stavros Papadopoulos
TileDB and the New Data Economics
Steven Kolawole
Building a Sign-to-Speech prototype with TensorFlow and DeepStack: How it happened & What I learned
Suliman Sharif
Using a Pythonic Compass to Link the Physics Community to the Chemistry Community
Sune Debel
Functional, Composable, Asynchronous, Type-Safe Python
Svea Marie Meyer
Sktime - a Unified Toolbox for Machine Learning with Time Series Data
,
Lightning Talks
Sven Mika
Large-Scale Production Reinforcement Learning with RLlib
Sylvain Corlay
Python Dashboarding Shootout and Showdown
Sylvia Lee
Bridging Data and Business: Power Plant Output Optimization Based on Electricity Market Price
,
Digital Hub™ Platform : An Innovative Data Science Toolkit
Temiloluwa Adeniyi
So Much Data, Such Poor Quality
Thibault Lestang
Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Thorsten Beier
JupyterLite: Jupyter ❤️ WebAssembly ❤️ Python
Timo Metzger
What's new in Bokeh 2.4
,
Lightning Talks
Timothy Odom
An online pedagogical tool for data science using Bokeh
,
Lightning Talks
Tomás Capretto
Bambi: A new library for Bayesian modeling in Python.
,
Lightning Talks
Tom Augspurger
Scalable Sustainability with the Planetary Computer
Tonya Sims
Faceoff Fun with Python Frameworks: FastAPI vs Flask 2.0
Usman Kamran
The Need for Modular Data Science Solutions
,
Lightning Talks
Utkarsh Mishra
Python and Flutter application for Colouring and Enhancing Old Photos
Valentina Bono
Image classification in retail: Lessons from the real world
Valentin Danchev
Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Vasu Sharma
Simplification as a Service
VELEZ RUEDA ANA JULIA
Analyzing gender based violence data with Python
Vicente Ruben Del Pino Ruiz
Introduction to Quantum Computing with Python and Qiskit
Vini Jaiswal
Data Engineering for successful Machine Learning
Violeta Misheva
Pragmatic Advice for Implementing Responsible Machine Learning: Technical and Organizational Best Pr
,
Lightning Talks
Wojtek Kuberski
How to detect silent model failures?
Xiuwen Tu
Best approximation of pi - an investigation with Python
,
Lightning Talks
Xu Chen
An online pedagogical tool for data science using Bokeh
,
Lightning Talks
Yacine Jernite
Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Yatin Bhatia
Best Practices in Machine Learning Observability
Yuan Tang
Towards Cloud-Native Distributed Machine Learning Pipelines at Scale (pre-recorded)
Zachary Blackwood
Automatic Pattern-Based Data Catalogs
,
Lightning Talks
Zayd Ma
Packaging PyData for Enterprise Software Supply Chain (pre-recording)