We dig into the mechanics of running and understanding A/B tests, including a practical demo of Jacquard, Thread's open-source tooling, then explore some of the dangerous pitfalls that come with testing.
A/B testing is one of the most powerful tools in the e-commerce arsenal. The data-driven approach it enables played a large part in the success of Amazon and Google. Arguably, A/B testing even helped elect Barack Obama. There have been many (generally self-righteous) blog posts extolling its virtues.
In this talk I'll instead dig into the actual mechanics of how we run and understand A/B tests at Thread. I'll introduce our open-source A/B testing tool, Jacquard, and its design, with a practical demo. I'll also explain how we interpret the information that comes out of our tests: we use Bayesian statistics and present our results in an unorthodox way.
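As a taste of what a Bayesian reading of an A/B test can look like, here is a minimal sketch (not Jacquard's actual implementation, and the counts are invented): with a uniform Beta(1, 1) prior, each arm's conversion rate has a Beta posterior, and we can estimate the probability that the variant truly beats the control by sampling from both posteriors.

```python
import random

# Hypothetical counts from an A/B test -- illustrative only.
control = {"visitors": 1000, "conversions": 50}
variant = {"visitors": 1000, "conversions": 65}

def posterior_samples(visitors, conversions, n=100_000, seed=None):
    """Draw samples from the Beta(1 + conversions, 1 + non-conversions)
    posterior over the conversion rate (uniform Beta(1, 1) prior)."""
    rng = random.Random(seed)
    return [rng.betavariate(1 + conversions, 1 + visitors - conversions)
            for _ in range(n)]

a = posterior_samples(**control, seed=0)
b = posterior_samples(**variant, seed=1)

# Estimated probability that the variant's true rate beats the control's.
p_variant_better = sum(vb > ca for ca, vb in zip(a, b)) / len(a)
print(f"P(variant > control) = {p_variant_better:.3f}")
```

Reporting "the variant is better with probability p" like this is often easier for non-statisticians to act on than a frequentist p-value.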
I'll also explore some of the risks of testing. There are tricky pitfalls and dangers to be aware of, many ways to invalidate the results of a test, and a huge number of biases to watch out for. I'll finish with some dire-sounding warnings about trusting A/B tests too much.