BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//global2024.pydata.org//YXUSUU
BEGIN:VEVENT
UID:pretalx-cfp-YXUSUU@global2024.pydata.org
DTSTART:20241204T163000Z
DTEND:20241204T170000Z
DESCRIPTION:Data deduplication is a ubiquitous data quality problem that mo
 st data people will encounter at some point in their career. It happens w
 henever multiple records are collected about the same person or other ent
 ity without a unique identifier that ties these records together.\n\nTh
 is talk provides beginners with everything they need to start linking an
 d deduplicating large datasets using [Splink](https://github.com/moj-anal
 ytical-services/splink)\, a free Python library.
DTSTAMP:20250709T220238Z
LOCATION:Data/ Data Science Track
SUMMARY:Rapid deduplication and fuzzy matching of large datasets using Spli
 nk - Robin Linacre
URL:https://global2024.pydata.org/cfp/talk/YXUSUU/
END:VEVENT
END:VCALENDAR