This is a talk about applying data science in the insurance industry, but not at a small, feisty, Lemonade-style insurtech: QBE is over a hundred years old and writes billions in premiums every year. That size and legacy mean there is a lot of opportunity, but they can also be a source of frustration. I'll give a broad intro to what we do, and dig a bit deeper into a couple of key issues.
This talk aims to introduce data science in the insurance industry, and to give a flavour of the problems our team is working on. The target audience is data scientists and engineers with at least basic machine learning knowledge. No insurance knowledge is assumed. The goal is for attendees to learn a bit about insurance, and (with any luck) pick up some techniques and ideas that they can apply to their own work.
Synopsis
The insurance business model heavily relies on data, and much of it revolves around predicting future events – seems like a natural fit for data scientists! But corporate insurance is a unique world, and there are lots of things we need to figure out, e.g.:
Claims distributions are funky and long-tailed: 90% of policies may be claim-free, but one or two might have claims well over a million pounds. That makes model building and evaluation tricky, because a handful of large claims can dominate any average.
QBE sells insurance to companies, not individuals. This could mean insuring cargo vessels, large properties, or a fleet of 300 trucks. It also means that relationships and customised contracts are really important, and final decision makers will nearly always be human, so model outputs have to be designed with this in mind.
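To give a rough feel for the long-tailed claims point above, here is a minimal simulation sketch. Everything in it is an assumption made for illustration, not QBE data or QBE's modelling approach: the 10% claim frequency, the lognormal severity, its parameters `mu` and `sigma`, and the helper names `simulate_claims` and `top_share` are all hypothetical.

```python
import random
import statistics


def simulate_claims(n_policies, seed=0, p_claim=0.10, mu=8.0, sigma=2.0):
    """Simulate one claim amount per policy (0.0 means claim-free).

    All parameters are illustrative assumptions: roughly 90% of policies
    are claim-free, and claim sizes follow a lognormal distribution.
    """
    rng = random.Random(seed)
    claims = []
    for _ in range(n_policies):
        if rng.random() < p_claim:
            claims.append(rng.lognormvariate(mu, sigma))
        else:
            claims.append(0.0)
    return claims


def top_share(claims, k=10):
    """Fraction of the total claim cost that comes from the k largest claims."""
    total = sum(claims)
    if total == 0:
        return 0.0
    return sum(sorted(claims)[-k:]) / total


claims = simulate_claims(10_000)
claim_free = sum(c == 0 for c in claims) / len(claims)
median_claim = statistics.median(claims)
mean_claim = statistics.fmean(claims)
share = top_share(claims, k=10)

print(f"claim-free policies:        {claim_free:.1%}")
print(f"median claim per policy:    {median_claim:,.0f}")
print(f"mean claim per policy:      {mean_claim:,.0f}")
print(f"share of cost from top 10:  {share:.1%}")
```

In a portfolio like this the median policy costs nothing, while the mean is driven by a few very large claims. Any metric that averages over policies can therefore swing a lot depending on which of those claims land in the training or test split.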
I’ll also cover some issues related to working in a large corporate with legacy systems. We’re solving these by building our own infrastructure internally, and hooking up to external data wherever we can.