Speakers

Name Presentation(s)
Abhilash Babu Using scikit-learn models in a C/C++ application, Lightning Talks
Achilleas Koutsou gin@home: run your own research data management platform, Lightning Talks
Adam Chang Highly-Scalable NLP to Answer Questions on South Africa’s COVID-19 WhatsApp Hotline
Adam Webber Building a Data-Driven Product from Scratch, How Hard Can It Be?
Adam Zadrożny Law, Graphs & Python
Adrien Treuille Python Dashboarding Shootout and Showdown
Adway Dhillon Serving BERT Models in Production with TorchServe
Aishwarya Agrawal Interpretable ML models at scale
Alankrita Tewari Anyone GAN do this: Solving the Minority Class Imbalance problem once and for all
Alejandro Saucedo Accelerating ML Inference at Scale with ONNX, Triton and Seldon
Aleksander Molak Modeling aleatoric and epistemic uncertainty using Tensorflow and Tensorflow Probability
Alex Wu Unifying Large Scale Data Preprocessing and Machine Learning Pipelines with Ray Datasets
Allen Downey Computational Survival Analysis
Alon Nir Sliding into Causal Inference, with Python!, Snack-size Awesome Lists, Lightning Talks
Amy Wooding Love your (data scientist) neighbour: Reproducible data science the Easydata way
Anders Berkeman Computations as Assets - a New Approach to Reproducibility and Transparency
Andrew Garrow A Platform to Enable Data Science At Scale in Tesco
Andrew Shao Introduction to Unsupervised and Semi-Supervised Learning in TensorFlow
Andrey Cheptsov From Jupyter Notebooks To JetBrains DataSpell
Ankit Rathi Best Practices in Machine Learning Observability
Antoni Baum Cutting edge hyperparameter tuning made simple with Ray Tune
April Rathe Dask: From POC to Production
Ardo Illaste It's not just about survival: using survival analysis to study customer behaviour, Lightning Talks
Aseel Addawood Reasoning with Natural Language Processing: advancement in the interpretation of Arabic speech
Ashraf Ibrahim Redefining Insurance with Predictive and Preventive Artificial Intelligence.
Austin Walters What's in your data: Data Profiler - An Open Source Solution to Explain Your Data
Avi Aminov Classifying Documents on a Graph using GNNs
Banjo Obayomi Lightning Talks, Building an Open Source Topic Modeling Library
Barry Fitzgerald Risk at Scale - Running a large investment risk system and how risk analysis techniques can help you
Benjamin Ajayi-Obe An analysis of Societal Bias in SOTA NLP Transfer Learning
Benjamin Lehne A Platform to Enable Data Science At Scale in Tesco
Benjamin Zaitlen Data Processing at Scale
Ben Marwick Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Brad Boehmke Foundational Infrastructure to Create a Successful Data Science Team
Braden Riggs Unlocking more from your Audio Data, Unlocking More From Your Audio Data, Lightning Talks
Brendan Collins Spatial Analytics using Dask & Numba, Lightning Talks
Brian Cechmanek Pointer Generator Summarisation and Explainability for Legal Documents, Lightning Talks
Bruno Gonçalves Graphs for Data Science with NetworkX (pre-recorded)
Carl Drougge Computations as Assets - a New Approach to Reproducibility and Transparency
Chase Ginther Enterprise Machine Learning Pipelines with Unstructured Image Data
Chengxuan Wang FugueSQL - The Enhanced SQL Interface for Pandas, Spark, and Dask DataFrames
Cheuk Ting Ho Knowledge graph data modelling with TerminusDB, Turning Pandas DataFrames to Semantic Knowledge Graph
Chin Hwee Ong Designing Functional Data Pipelines for Reproducibility and Maintainability
Chris Ostrouchov Serving and Managing Reproducible Conda Environments via Conda-Store
Christian Juncker Brædstrup Stau - lightweight job orchestration for data science workloads, Lightning Talks
Christopher Ariza Why Datetimes Need Units: Avoiding a Y2262 Problem & Harnessing the Power of NumPy's datetime64
Christopher Lozinski An Introduction to the Current Information War
Clair J. Sullivan Working with Data in a Connected World: the Power of Graph Data Science
Clark Zinzow Unifying Large Scale Data Preprocessing and Machine Learning Pipelines with Ray Datasets
Cor Zuurmond Wisdom of the Crowd: amplifying human intelligence with AI
Dana Averbuch Take a Deep Berth and Let’s Dive Into the Matching Algorithm for Marina’s, Lightning Talks
Daniel Townsend Managing your data using FastAPI and Piccolo Admin
Danny Chiao Feature Stores: An operational bridge between machine learning models and data
David Hopes An analysis of Societal Bias in SOTA NLP Transfer Learning
Dean Pleban 🦉DVC Showcase – Who Moved My Data? 🗂
Diego Arenas Automating the Exploration of Databases for Data Science with AEDA, Lightning Talks
Dingqian (Sara) Liu What do C-Suite Executives Pay Attention To?
Dipam Paul Anyone GAN do this: Solving the Minority Class Imbalance problem once and for all
Doris Jung-Lin Lee Lux: Automatic Visualizations for Exploratory Data Science
Dor Kedem Introduction to Distance Metric Learning
Dr. Brandeis Marshall (she/her) Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Dr. Shrirang Ambaji Kulkarni Impact of Noisy Data on Support Vector Machine and Deep Learning Algorithms on Edge Computing Device
Eduardo Blancas Effective Testing for Machine Learning Projects
Ehsan Totoni Bodo: Supercomputing-Like Performance and Scale for Python/Pandas
Eli Sander Think Like Git
Emeli Dral Is my data drifting? Early monitoring for machine learning models in production.
Eric Dill conda-forge in 2021
Eric Ma An attempt at demystifying graph deep learning
Ethan Swan Foundational Infrastructure to Create a Successful Data Science Team
Evgeny Karev Livemark: markdown for data journalism and documentation writing, Lightning Talks
Eyal Kazin Start Asking Your Data “Why?” - A Gentle Introduction To Causal Inference
Farooq Shaikh Building a Meta Forecasting Model with Prophet and LSTM for Time series Forecasting, Lightning Talks
Filip Jankovic From Jupyter to Production: Deploying an Influenza Monitoring System at Scale with Wearable Sensors
Francesc Alted Introducing Blosc2, the next generation of the Blosc compressor
Francesco Lässig Darts for Time Series Forecasting
Francesco Tisiot Get to know Apache Kafka with Jupyter Notebooks, Kickstart your Apache Kafka with Faker Data, Lightning Talks
Frits Hermans DedupliPy: a new deduplication package
Gaby Lio Data Science in the Enterprise: A Holistic Approach
Gatha Visualizations for Privacy Preservation: The Balancing Act between Utility and Uncertainty
Gus Cavanaugh High Performance Python With Numba, Dask, and Rapids For the Absolute Beginner
Gus Powers Foundational Infrastructure to Create a Successful Data Science Team
Guzal Bulatova Sktime - a Unified Toolbox for Machine Learning with Time Series Data, Lightning Talks
Han Wang Large Scale Data Validation with Fugue
hassaku Improving accessibility for data science with graph sonification library, Lightning Talks
Haw-minn Lu Map Visualizations with Dash Leaflet
Hoda Rezaei Innovating in the Oil & Gas Industry with AI/ML
Hugo Bowne-Anderson Dask for Everyone
Hussain Sultan Packaging PyData for Enterprise Software Supply Chain (pre-recording)
Huy Ngo Towards Collaborative Reproducibility: Pinning Repository of Binary Distributions
Ilana Tuil Productizing Wav2Vec 2.0: challenges and considerations, Lightning Talks
Iman Mossavat Compressive Sensing
Imen Ayari Software inspired workflow for Data Analysis
Irina Vidal Migallón The prototype hole and tools to help you out of it
Ivana Feldfeber Analyzing gender based violence data with Python
Jacob Schreiber Submodular optimization for minimizing redundancy in massive data sets
Jacob Tomlinson GPU development with Python 101
Jacob Zelko A Visual Odyssey: Animations and Visualizations Made with Julia, Lightning Talks
Jacqueline Nolis Enterprise Machine Learning Pipelines with Unstructured Image Data
James A. Bednar Python Dashboarding Shootout and Showdown
James Bourbeau Data Processing at Scale, Snowflake & Dask: How to scale workloads using distributed fetch capabilities
James Laidler Iguanas - A new rule generation and optimisation package, Lightning Talks
James Powell So you wanna be a Pandas expert? | (Pre-recorded Tutorial), So you wanna be a Pandas expert? | (Live Q&A)
Jason Lee Improving Topic Model Interpretability Through Aggregation, Lightning Talks
Jeremy Goodsitt What's in your data: Data Profiler - An Open Source Solution to Explain Your Data
Jeremy John Selva Tips and advice when creating a python software for lab members to use in academia, Lightning Talks
Jeremy Tuloup JupyterLite: Jupyter ❤️ WebAssembly ❤️ Python
Jesper Dramsch How to Guarantee No One Understands What You Did in Your Machine Learning Project
Jiri Navratil Uncertainty Quantification 360: A Hands-on Tutorial
John Cox Using Generative Adversarial Networks (GANs) to Produce Elliptic Curves of Specified Rank, Lightning Talks
John McCambridge Lightweight and Automated Exploratory Data Analysis, Lightning Talks
John Sandall Agile Data Science: How To Implement Agile Workflows For Analytics & Machine Learning
Jonathan Bechtel Behind the Black Box: How to Understand Any ML Model Using SHAP
Juan Luis Cano Rodríguez Document your scientific project with Markdown, Sphinx, and Read the Docs
Juan Orduz Exploring Tools for Interpretable Machine Learning
Jules S. Damji Feature Stores: An operational bridge between machine learning models and data
Julia Kodysh Data infrastructure at the COVID Tracking Project
Julien Herzen Darts for Time Series Forecasting
Jun Liu Fugue Tune: A Simple Interface for Distributed Hyperparameter Optimization
Kalyan Munjuluri Machine Learning Lifecycle Made Easy with MLflow
Kalyan Prasad Neural Prophet – A powerful AI framework for Time Series Models
Karishma Babbar Machine Learning Lifecycle Made Easy with MLflow
Kevin Anderson, Mark Mikofski, Abhishek Parikh, Silvana Ovaitt Data and tools to model PV Systems
Kevin Duisters Wisdom of the Crowd: amplifying human intelligence with AI
Kevin Jahns Collaborative editing in Jupyter Notebook
Kevin Kho An Intro to Workflow Management with Prefect, Simplifying Testing of Spark Applications
Kevin Stumpf Snowflake and Tecton: How to build production-ready machine learning pipelines
Kirstie Whitaker Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Kjell Wooding Makefiles: One great trick for making your conda environments more manageable.
Laura Jehl Analyzing Company Filings for Stock Selection – a Practical Report
Layne Sadler AIQC; deep learning experiment tracking with multi-dimensional pre/post-processing.
Liron Faybish (Ben-Kimon) Keeping sensitive data safe using recommendation systems
Logan Kilpatrick The importance of non-code open source contributions, Lightning Talks
Lucy Jiménez Analyzing gender based violence data with Python
Madhur Tandon JupyterLite: Jupyter ❤️ WebAssembly ❤️ Python
Magdalena Wiercioch Deep learning-aided drug discovery
Malte Tichy What could possibly go wrong when evaluating forecasts?
Manojit Nandi Assessing and Mitigating Unfairness in AI Systems
Marcin Mosiolek Deep Neural Deduplication
Marco Edward Gorelli Let's Implement Bayesian Ordered Logistic Regression!, Run any Python code quality tool on a Jupyter Notebook!, Lightning Talks
Marek Suppa Adapters: A neat (and production-enabling) trick for multi-task and multi-lingual NLP, Lightning Talks
Mark Keller Snowflake & Dask: How to scale workloads using distributed fetch capabilities
Markus Löning sktime - A Unified Toolbox for Machine Learning with Time Series
Martin Durant All you need is zarr.: Parallel access to remote HDF5, TIFF, grib2 and others., Data Processing at Scale
Martin Renou JupyterLite: Jupyter ❤️ WebAssembly ❤️ Python
Martin Wanjiru A first step from ad-hoc SQL to scalable ETL
Matthew Powers 5 Reasons Parquet files are better than CSV for data analyses, Data Processing at Scale
Max Ghenis Fusing economic survey datasets with the synthimpute Python package
Meenal Jhajharia Football Analytics Using Hierarchical Bayesian Models in PyMC
Megan Yow Simplifying Testing of Spark Applications
Meirav Ben Izhak Storytelling With Data – How To Turn a Basic Dataset Into a Compelling Story
Meredydd Luff From Jupyter Notebook to Production Web App, with Anvil and (only) Python
Meshva Patel Data Analytics in Healthcare, Lightning Talks
Michael Sonntag gin@home: run your own research data management platform, Lightning Talks
Mika Pflüger gin@home: run your own research data management platform, Lightning Talks
Mike McCarty Legate Pandas: Scaling the Python ecosystem
Miki Tebeka Lightning Talks, Using IPython as Presentation Tool, Lightning Talks
Milecia McGregor Using Reproducible Experiments To Create Better Machine Learning Models
Miles Adkins Snowflake and Tecton: How to build production-ready machine learning pipelines, Snowflake & Dask: How to scale workloads using distributed fetch capabilities
Mitali Sanwal Getting started with Dask using Saturn Cloud
MUSASIZI FRANCIS KAMANZI Image(face) Classification with Computer Vision and Python
Nabanita Roy Getting Started with Text Classification: Predict if Tweets are about Real Disasters
Neel Surya An online pedagogical tool for data science using Bokeh, Lightning Talks
Nguyễn Gia Phong Towards Collaborative Reproducibility: Pinning Repository of Binary Distributions
Nicolas Kruchten Why *Interactive* Data Visualization Matters for Data Science in Python, Python Dashboarding Shootout and Showdown
Nicolo Musmeci Modelling and data visualisation for commodity trading, Lightning Talks
Nidhin Pattaniyil Deploying a Mobile App on Tensorflow: Lessons Learned, Serving BERT Models in Production with TorchServe
Niels Bantilan Robust, End-to-end Online Machine Learning Applications with Flyte, Pandera and Streamlit
Nik Agarwal Best of Both Worlds: R & Python, Lightning Talks
Nimrita Koul Data Analysis with Pandas and NumPy
Nir Barazida 📚 Notebook To Production 👷🏼
Olga Bane Predictive modeling in a video advertising marketplace
Oliver Gindele ML in Production – Serverless and Painless
Oliver Rieger Machine learning in health: Predicting pregnancy complications
Omri Fima Components, Workflows, and Cookbooks - Building Medical Grade AI pipelines with Argo Workflows
Oyidiya Oji Tech for Posterity. Challenges for the Future of AI Ethics and DEI
Paco Nathan Graph Thinking
Paul Klinger Image classification in retail: Lessons from the real world
Philipp Rudiger Build polished, data-driven applications directly from your Pandas or XArray pipelines, Python Dashboarding Shootout and Showdown
Pranav Kompally Project Nirvana : A Podcast Summariser, Lightning Talks
Prasanna Sattigeri Uncertainty Quantification 360: A Hands-on Tutorial
Quan Nguyen Making the Perfect Cup of Joe: Active Preference Learning and Optimization Under Uncertainty
Rachel Shalom Deep Learning for Tabular Data
Raghuram Thiagarajan An online pedagogical tool for data science using Bokeh, Lightning Talks
Ramiro Caro What to do when you can't trust your labels? A practical approach
Reshama Shaikh Deploying a Mobile App on Tensorflow: Lessons Learned
Richard Zamora Data Processing at Scale
Ritchie Vink Polars, the fastest DataFrame library you never heard of.
Robert Meyer Lessons learned from deploying Machine Learning in an old-fashioned heavy industry
Rongpeng Li (Ron) Can the “Best” Language Model Detect Logical Fallacies?, Lightning Talks
Ross Hart Building linear programs with ORTools
Ruben Mak Some Attention for Attenuation Bias
Ryan Soklaski hydra-zen: Configurable, Reproducible, and Scalable Computing with Hydra, MyGrad: Drop-in automatic differentiation for NumPy, Lightning Talks
Sandhya Prabhakaran Sparcle: assigning transcripts to cells in multiplexed images
Sanket Verma PyData Meetup Organisers Social
Sarah Krasnik Dev, Staging, and Production in Data Engineering with Terraform
Sarah Schuhegger Automatic body part recognition for medical images with python, Lightning Talks
Sara Stoudt Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Sara Tähtinen Computer vision and xAI: explaining a single prediction with visualisations and examples
Sebastian Lautz Extracting complements and substitutes from sales data - a network perspective
Sebastian M. Ernst Amazing things your (Unix) operating system can do for you: POSIX shared memory, Lightning Talks
Sergii Mikhtoniuk Time: The most misunderstood dimension in data modelling
Seth Shelnutt Extending Jupyter Data Visualizations Beyond the Notebook
Shagun Sodhani Profiling and Tuning PyTorch Models
Shashank Shekhar Counter Factual Analysis for Explainable AI
Shivay Lamba Tensorflow For the Web : Converting Python Machine Learning Models to Javascript using TFJS Converte, Lightning Talks
Simona Maggio Lightning Talks, Stress Test Center: moving the stress from the user to the model
Sin-seok SEO Know Your Data First: An Introduction to Exploratory Data Analysis
Sivan Biham Wounds Over Time - Tracking Wound Healing via 3D Models
Smriti Singh NLP and Hate speech: Why does it matter and what can we do?
Sofia Hörberg Computations as Assets - a New Approach to Reproducibility and Transparency
Soumya Ghosh Uncertainty Quantification 360: A Hands-on Tutorial
Stavros Papadopoulos TileDB and the New Data Economics
Steven Kolawole Building a Sign-to-Speech prototype with TensorFlow and DeepStack: How it happened & What I learned
Suliman Sharif Using a Pythonic Compass to Link the Physics Community to the Chemistry Community
Sune Debel Functional, Composable, Asynchronous, Type-Safe Python
Svea Marie Meyer Sktime - a Unified Toolbox for Machine Learning with Time Series Data, Lightning Talks
Sven Mika Large-Scale Production Reinforcement Learning with RLlib
Sylvain Corlay Python Dashboarding Shootout and Showdown
Sylvia Lee Bridging Data and Business: Power Plant Output Optimization Based on Electricity Market Price, Digital Hub™ Platform : An Innovative Data Science Toolkit
Temiloluwa Adeniyi So Much Data, Such Poor Quality
Thibault Lestang Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Thorsten Beier JupyterLite: Jupyter ❤️ WebAssembly ❤️ Python
Timo Metzger What's new in Bokeh 2.4, Lightning Talks
Timothy Odom An online pedagogical tool for data science using Bokeh, Lightning Talks
Tomás Capretto Bambi: A new library for Bayesian modeling in Python., Lightning Talks
Tom Augspurger Scalable Sustainability with the Planetary Computer
Tonya Sims Faceoff Fun with Python Frameworks: FastAPI vs Flask 2.0
Usman Kamran The Need for Modular Data Science Solutions, Lightning Talks
Utkarsh Mishra Python and Flutter application for Colouring and Enhancing Old Photos
Valentina Bono Image classification in retail: Lessons from the real world
Valentin Danchev Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Vasu Sharma Simplification as a Service
VELEZ RUEDA ANA JULIA Analyzing gender based violence data with Python
Vicente Ruben Del Pino Ruiz Introduction to Quantum Computing with Python and Qiskit
Vini Jaiswal Data Engineering for successful Machine Learning
Violeta Misheva Pragmatic Advice for Implementing Responsible Machine Learning: Technical and Organizational Best Pr, Lightning Talks
Wojtek Kuberski How to detect silent model failures?
Xiuwen Tu Best approximation of pi - an investigation with Python, Lightning Talks
Xu Chen An online pedagogical tool for data science using Bokeh, Lightning Talks
Yacine Jernite Building Responsible Data Science Workflows: Transparency, Reproducibility, and Ethics by Design
Yatin Bhatia Best Practices in Machine Learning Observability
Yuan Tang Towards Cloud-Native Distributed Machine Learning Pipelines at Scale (pre-recorded)
Zachary Blackwood Automatic Pattern-Based Data Catalogs, Lightning Talks
Zayd Ma Packaging PyData for Enterprise Software Supply Chain (pre-recording)