Saturday 2:15 PM–3:00 PM in Room 2

Data Engineering Architecture at Simple

Rob Story

Audience level:
Intermediate

Description

A walk through Simple's Data Engineering stack, including lessons learned and why we chose certain tools and languages for different parts of our infrastructure.

Abstract

Simple's Data Engineering team has spent the past year and a half building data pipelines to enable the customer service, marketing, finance, and leadership teams to make data-driven decisions.

We'll walk through why the data team chose certain open-source tools, including Kafka, RabbitMQ, Postgres, Celery, and Elasticsearch. We'll also discuss the advantages of using Amazon Redshift for data warehousing and some of the lessons learned from operating it in production. For each of these tools, we'll cover some of the best Python libraries for interacting with them.

Finally, we'll touch on the team's language choices across the data engineering stack, which include Scala, Java, and Clojure in addition to Python.