Sunday 2:00 PM–2:45 PM in Speakeasy

Mental Models to Use and Avoid as a Data Scientist

Jonathan Whitmore, Christian Perez

Audience level:
Intermediate

Description

Using Jupyter Notebooks and Python code, we will present several data-driven examples of simple, powerful, yet relatively uncommon ways of thinking that make for a good Data Scientist. We will also warn about a few dangerous ways of thinking to avoid. Our Jupyter Notebooks and slides will be made freely available after the talk.

Abstract

The first principle is that you must not fool yourself and you are the easiest person to fool. -- Richard Feynman

A scientific mindset is a powerful force for producing knowledge, from academic fields to business and public policy. The combination of statistical rigor, an experimental mindset, and the drive to ask the right questions produces crucial input for decision making.

We will illustrate several principles of scientific thinking that should be more widespread. Each principle will be demonstrated with a specific example that will help it stick with the audience after they leave.

A few key principles, attitudes, and techniques for Data Science:

  • "I am easy to fool" -- acknowledge confirmation and experimenter bias
  • "Everyone knows what it takes to change my mind" -- culture of critical thinking, avoid congruence bias (test alternative hypotheses)
  • "A good explanation can be proven wrong" -- falsifiability
  • "The strength of my belief is proportional to the strength of the evidence" -- degrees of plausibility
  • Blinding analyses -- not letting the expected result drive model revision
  • Answering the question: for a hypothetical case, what would it look like if your hypothesis were wrong?
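
The "degrees of plausibility" principle in the list above can be sketched with Bayes' rule. This is a minimal illustration, not code from the talk's notebooks: the function name `update_belief` and all the probabilities are hypothetical values chosen for the example.

```python
def update_belief(prior, p_data_if_true, p_data_if_false):
    """Return the posterior probability of a hypothesis after one observation.

    prior            -- belief in the hypothesis before seeing the data
    p_data_if_true   -- probability of the data if the hypothesis is true
    p_data_if_false  -- probability of the data if the hypothesis is false
    (All three inputs are illustrative assumptions.)
    """
    numerator = prior * p_data_if_true
    evidence = numerator + (1.0 - prior) * p_data_if_false
    return numerator / evidence


prior = 0.5  # start undecided

# Weak evidence: the data is only slightly more likely if the hypothesis is true.
weak = update_belief(prior, p_data_if_true=0.55, p_data_if_false=0.45)

# Strong evidence: the data is much more likely if the hypothesis is true.
strong = update_belief(prior, p_data_if_true=0.95, p_data_if_false=0.05)

# Uninformative evidence: the data looks the same whether or not the
# hypothesis is true, so it should not move the belief at all.
unchanged = update_belief(prior, p_data_if_true=0.30, p_data_if_false=0.30)

print(f"weak: {weak:.2f}, strong: {strong:.2f}, unchanged: {unchanged:.2f}")
```

The third case ties back to falsifiability: if an observation would look the same whether the hypothesis is right or wrong, it carries no evidence either way.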

Mindsets to avoid:

  • Just-so storytelling is a tempting anti-pattern -- creating a story to explain the data after knowing the answer.
  • "The correct explanation must be easy for me to understand" -- availability or belief bias.
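
One way to see why just-so storytelling is tempting is to search pure noise for a pattern. The sketch below is an assumed illustration, not material from the talk: the sample sizes, the seed, and the helper `pearson` are hypothetical choices. Every "feature" is random by construction, yet the best one will usually correlate with the target strongly enough to invite an explanation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n_rows, n_features = 20, 200  # few rows, many candidate "explanations"

# The target and every feature are independent Gaussian noise.
target = [rng.gauss(0, 1) for _ in range(n_rows)]
features = [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_features)]

correlations = [pearson(feature, target) for feature in features]
best = max(correlations, key=abs)

print(f"strongest correlation found in pure noise: {best:+.2f}")
```

Any story told about the winning feature would be invented after the fact. Writing down the hypothesis before looking, or checking it on held-out data, guards against this.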