Saturday 2:15 PM–3:00 PM in Fairness in AI - Room 100D/E

Machine Learning in the age of increasing Data Privacy Consciousness

Sam Talasila

Audience level:
Novice

Description

In the age of GDPR and its related lawsuits, the prevailing strategy of being indiscriminately “data greedy” to power opaque machine learning (ML) based products is no longer viable. In this talk, I’ll examine how ML and Data Science practitioners can balance the principles of “data minimization” and “transparency” with delivering personalized, insightful products to their privacy-conscious users.

Abstract

Legislation such as the GDPR and proposals such as the California Consumer Privacy Act (CCPA) of 2018 are strong signals of the growing consciousness around the collection and use of personal data. In this environment, practicing “data minimization” (collecting only what is needed, and only through legal means) and “transparency” (being clear about how and for what purpose the collected data is used) becomes the de facto modus operandi for any practitioner of Data Science (DS) and ML.
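
To make the data-minimization principle concrete, here is a minimal sketch (not from the talk; the column names and the allow-list are hypothetical) of keeping only the fields a model actually needs, so direct identifiers never enter the training set:

    import pandas as pd

    # Hypothetical allow-list: only the fields the model genuinely needs.
    ALLOWED_FEATURES = ["order_total", "num_sessions", "days_since_signup"]

    def minimize(raw: pd.DataFrame) -> pd.DataFrame:
        """Drop everything outside the allow-list before data leaves the source system."""
        return raw[ALLOWED_FEATURES].copy()

    raw = pd.DataFrame({
        "email": ["a@example.com"],  # direct identifier -- intentionally never kept downstream
        "order_total": [42.0],
        "num_sessions": [7],
        "days_since_signup": [120],
    })

    train_df = minimize(raw)
    print(list(train_df.columns))  # ['order_total', 'num_sessions', 'days_since_signup']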

In this talk, I begin by describing the principles embodied in laws such as the GDPR and the CCPA and the compliance burdens they introduce. Then, drawing on my experience at Shopify using Python in a distributed setting, I will discuss the practical changes to pipelines and algorithm choices that Data Science and ML practitioners should consider. These lessons will be tied back to the privacy-by-design paradigm, which enables the development of products that value trust and transparency.
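
As one illustration of the kind of pipeline change this refers to (a hedged sketch under assumed names, using PySpark for the distributed setting, not Shopify's actual implementation), identifiers can be pseudonymized with a salted hash at the start of a job so that downstream joins and aggregates never touch raw customer IDs:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("pseudonymize-demo").getOrCreate()

    # Hypothetical input: raw events keyed by a customer identifier.
    events = spark.createDataFrame(
        [("cust-1", "page_view"), ("cust-2", "checkout")],
        ["customer_id", "event_type"],
    )

    SALT = "rotate-me-per-release"  # in practice, fetched from a secrets manager

    # Replace the raw identifier with a salted SHA-256 digest before any joins or writes.
    pseudonymized = events.withColumn(
        "customer_id",
        F.sha2(F.concat(F.col("customer_id"), F.lit(SALT)), 256),
    )

    pseudonymized.show(truncate=False)

Note that salted hashing is pseudonymization rather than anonymization under the GDPR, so access controls and retention policies still apply to the output.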

By the end of this talk, you should have a clear idea of how laws such as GDPR and CCPA affect your workflows and what principles to consider while designing ML products with privacy in mind.
