AI in healthcare is approaching the production stage. How can we implement it responsibly when real-time data is not as curated as the data used during training? We present tools built for an ML-based product running live in the ICU, which ensure that the model is only applied when it can be trusted, data problems are caught in real time, and changes in model outputs can be explained in retrospect.
An increasing number of AI-driven software products are being productionised in hospitals and used by doctors to make critical decisions. To ensure the responsible implementation of ML models, it is vital that models give adequate outputs not only in the curated training environment, but also in the messy world of real-time data.
This is especially true in healthcare, where hundreds of different vital signs, laboratory values, medications and patient demographics may contribute to ML predictions. Changes in data registration processes, manual data entry errors, and large shifts in patient population or treatment policies happen regularly and unexpectedly.
What can we do to ensure the responsible implementation of ML given these challenges? Pacmed Critical is one of the first examples of AI used in production in Dutch hospitals. The software supports intensive care doctors by predicting whether patients can be safely discharged from the ICU. In this talk we present the three ways in which Pacmed Critical ensures responsible predictions in production:
When (not) to predict? How the ML model avoids giving predictions when it should not be trusted, using both business rules and out-of-domain detection (see the first sketch after this list)
How to ensure data validity? How automated ML monitoring continuously catches real-time data problems and data drift (see the second sketch after this list)
Why did that prediction change? How a dashboard that visualises the change of Shapley values over the course of a patient's admission helps interpret and explain the causes of prediction changes for an individual patient (see the third sketch after this list)
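The first sketch below shows one way the "when (not) to predict" idea could be implemented: wrapping a fitted model so that it abstains when a business rule fails or a novelty detector flags the input as out-of-domain. The class name, feature names, rules and the choice of IsolationForest are illustrative assumptions, not Pacmed's actual implementation.

```python
# Minimal sketch: abstain from predicting when the input cannot be trusted.
# Feature names, thresholds and the IsolationForest detector are hypothetical.
import numpy as np
from sklearn.ensemble import IsolationForest


class GuardedModel:
    """Wraps a fitted classifier and abstains on untrusted inputs."""

    def __init__(self, model, training_features: np.ndarray):
        self.model = model
        # Fit a novelty detector on the training distribution so inputs
        # far outside it can be flagged as out-of-domain.
        self.ood_detector = IsolationForest(random_state=0).fit(training_features)

    def _passes_business_rules(self, features: dict) -> bool:
        # Example rules (hypothetical): key vitals must be present and the
        # patient must have been admitted long enough to have meaningful data.
        required = ["heart_rate", "mean_arterial_pressure", "hours_since_admission"]
        if any(features.get(name) is None for name in required):
            return False
        return features["hours_since_admission"] >= 4

    def predict(self, features: dict, feature_vector: np.ndarray):
        if not self._passes_business_rules(features):
            return None  # abstain: a business rule was violated
        if self.ood_detector.predict(feature_vector.reshape(1, -1))[0] == -1:
            return None  # abstain: input looks out-of-domain
        # Otherwise return the probability of safe discharge.
        return self.model.predict_proba(feature_vector.reshape(1, -1))[0, 1]
```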
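The second sketch illustrates what continuous data-validity monitoring could look like: comparing a batch of live data against training statistics for missingness, physiologically plausible ranges, and distribution drift (here via a two-sample Kolmogorov–Smirnov test). All thresholds, feature ranges and function names are illustrative assumptions.

```python
# Minimal sketch: data-quality and drift checks for a batch of live data.
# Thresholds and plausible ranges are hypothetical placeholders.
import pandas as pd
from scipy.stats import ks_2samp

PLAUSIBLE_RANGES = {          # hypothetical physiological bounds
    "heart_rate": (20, 250),
    "lactate": (0.0, 30.0),
}


def check_batch(live: pd.DataFrame, train: pd.DataFrame,
                max_missing: float = 0.2, drift_p: float = 0.01):
    """Return a list of human-readable alerts for a batch of live data."""
    alerts = []
    for col in train.columns:
        if col not in live:
            alerts.append(f"{col}: column missing from live feed")
            continue
        # Missingness check against a fixed threshold.
        missing = live[col].isna().mean()
        if missing > max_missing:
            alerts.append(f"{col}: {missing:.0%} missing (threshold {max_missing:.0%})")
        # Range check against plausible physiological bounds.
        if col in PLAUSIBLE_RANGES:
            lo, hi = PLAUSIBLE_RANGES[col]
            bad = ((live[col] < lo) | (live[col] > hi)).mean()
            if bad > 0:
                alerts.append(f"{col}: {bad:.1%} of values outside [{lo}, {hi}]")
        # Two-sample KS test: a small p-value suggests the live distribution
        # has drifted away from the training distribution.
        res = ks_2samp(live[col].dropna(), train[col].dropna())
        if res.pvalue < drift_p:
            alerts.append(f"{col}: possible drift (KS={res.statistic:.2f}, p={res.pvalue:.1e})")
    return alerts
```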
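The third sketch shows how Shapley values can be computed for hourly snapshots of a single patient and plotted over time, so a change in the discharge prediction can be traced back to the features driving it. The synthetic data, gradient-boosting model and plot are placeholders; the actual Pacmed model and dashboard differ.

```python
# Minimal sketch: track how Shapley values evolve over a patient's admission.
# All data is synthetic and the model is a placeholder.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Placeholder training data: one row per patient snapshot.
rng = np.random.default_rng(0)
X_train = pd.DataFrame(rng.normal(size=(500, 3)),
                       columns=["heart_rate", "lactate", "creatinine"])
y_train = rng.integers(0, 2, size=500)
model = GradientBoostingClassifier().fit(X_train, y_train)

# Hourly feature snapshots for a single (synthetic) patient.
patient_snapshots = pd.DataFrame(rng.normal(size=(24, 3)), columns=X_train.columns)

# TreeExplainer gives per-feature contributions for each snapshot.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(patient_snapshots)

# Plot each feature's contribution over time: a sudden change in the
# prediction shows up as a change in one or more of these curves.
hours = range(len(patient_snapshots))
for i, feature in enumerate(patient_snapshots.columns):
    plt.plot(hours, shap_values[:, i], label=feature)
plt.xlabel("Hours since admission")
plt.ylabel("SHAP value (contribution to discharge prediction)")
plt.legend()
plt.show()
```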