Saturday 10:15 AM–11:00 AM in Fairness in AI - Room 100D/E

Right Code, Right Place, Right Time

Tim Hopper

Audience level:
Intermediate

Description

Many of the most important problems a data scientist faces day-to-day are software engineering problems. Among them is the challenge of ensuring the right code runs in the right place at the right time. I discuss why solving this challenge matters and describe some possible implementations.

Abstract

  1. Engineering is the hardest problem of data science
  2. Problems with running
    1. Not the right code
      1. Wrong dependencies
      2. Wrong versions
      3. Wrong configuration
    2. Not the right place
      1. Wrong deployment
      2. Wrong network configurations (e.g. accessing external services)
    3. Not the right time
      1. Not following schedule
      2. Not triggered automatically
  3. Some solutions
    1. Right code
      1. Version hashes
      2. Checksums
      3. Semantic versioning
      4. Binary repository
    2. Right place
      1. Reproducible infrastructure and configuration
      2. Containers and container orchestration
    3. Right time
      1. Job scheduling
      2. Visibility and monitoring
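
As one illustration of the "right code" items in the outline, a checksum comparison might be sketched as follows. This is a minimal sketch and not taken from the talk: the function names `sha256_of` and `is_expected_artifact` are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large artifacts are not loaded whole.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_expected_artifact(path: Path, expected_digest: str) -> bool:
    """Check a file against a digest recorded at build time (hypothetical helper)."""
    return sha256_of(path) == expected_digest.lower()
```

A deployment step could call `is_expected_artifact` before running a job and refuse to start if the digest recorded at build time does not match the file on disk.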
