An introduction to applying TDD in a Data world. This talk takes the experience of traditional TDD from a Web Development background and translates it into useful techniques for Data Scientists. Hopefully, by the end of this talk, TDD will be far less of a buzzword and you'll enjoy applying it more yourself!
I’ll begin by introducing how I learnt TDD (Test Driven Development) from Web Development, then walk through how the traditional way of using TDD sometimes doesn’t apply to Data Science. By exploring examples of data-oriented TDD, I’ll show that even though TDD is a rigorous practice, it can also be fun, and that it can give you more space to explore how to build software.
Building upon the data testing, we’ll look at how to apply TDD to machine learning models and why it’s tricky to build deterministic tests for them. Then we’ll bring it all together with pipeline testing: why it can be difficult, and ways to create tests that act as a proxy for it. I'll also cover when not to use TDD, as forcing it where it doesn't fit can ruin the fun.
By the end of the talk, you should have ideas for implementing TDD in your workflow with data, and ways to convince people to play along with you!