In this presentation, we discuss how we tackle time series forecasting for our Real Time Dynamic Pricing service and the challenges we faced putting a real-time streaming ML service into production while adopting an MLOps lifecycle.
At Beat, we employ deep learning to predict vehicle requests and rides for a given geographic region at a given time. These predictions inform the fare we set for each ride and help us balance demand and supply in the markets where we operate by incentivising drivers. Over the last couple of years, deep learning has dominated time series prediction tasks, with LSTM- and CNN-based architectures setting the state of the art in forecasting performance. Deploying deep learning models for time series prediction poses many challenges in data preprocessing. In particular, we have to transform our data into a supervised learning problem, deal with seasonality and trend, and scale numerical features. Moreover, productionizing our parallel LSTM model as a real-time ML service posed new challenges in testing, continuous integration and deployment, since we maintain multiple deployments for different countries and cities. We will discuss how we tackled these using technologies such as Kubernetes and Kubeflow.
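To make the preprocessing steps named above concrete, here is a minimal sketch in plain Python of one common way to handle them: seasonal differencing to remove a repeating pattern and trend, standard scaling fitted on the training portion only, and a sliding window that turns the series into (lagged inputs, target) pairs for supervised learning. This is an illustration under assumptions, not the pipeline used at Beat: the function names, the toy request counts, the period of 4 and the window of 3 lags are all hypothetical.

```python
from statistics import mean, pstdev


def seasonal_difference(series, period):
    """Subtract the value observed one full period earlier from each point."""
    return [series[i] - series[i - period] for i in range(period, len(series))]


def fit_standard_scaler(train):
    """Return (mean, std) of the training slice; std falls back to 1.0 if zero."""
    mu = mean(train)
    sigma = pstdev(train) or 1.0
    return mu, sigma


def scale(values, mu, sigma):
    """Apply standard scaling with previously fitted statistics."""
    return [(v - mu) / sigma for v in values]


def make_windows(series, n_lags, horizon=1):
    """Build (X, y): each X row holds n_lags past values, y the value horizon steps ahead."""
    X, y = [], []
    for start in range(len(series) - n_lags - horizon + 1):
        X.append(series[start:start + n_lags])
        y.append(series[start + n_lags + horizon - 1])
    return X, y


# Hypothetical request counts with a repeating pattern of period 4 and a mild upward trend.
requests = [10, 20, 30, 20, 13, 22, 35, 21, 15, 27, 36, 26]

diffed = seasonal_difference(requests, period=4)

# Fit the scaler on the first 6 points only, so later data does not leak into training.
split = 6
mu, sigma = fit_standard_scaler(diffed[:split])
scaled = scale(diffed, mu, sigma)

# Each sample: 3 lagged values as input, the next value as target.
X, y = make_windows(scaled, n_lags=3)
```

Fitting the scaling statistics on the training slice alone is a deliberate choice in this sketch: the same `mu` and `sigma` would then be reused at serving time, and predictions would be mapped back to request counts by inverting the scaling and the differencing.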