Towards Trustworthy AI in Healthcare: Epistemic Uncertainty Estimation for Clinical Decision Support
Overview
Widespread adoption of AI for medical decision-making is still hindered by ethical and safety-related concerns. AI-based decision support systems in healthcare settings must be reliable and trustworthy. Common deep learning approaches, however, tend towards overconfidence when faced with unfamiliar or changing conditions. Inappropriate extrapolation beyond well-supported scenarios may have dire consequences, which highlights the importance of reliably estimating local knowledge uncertainty and communicating it to the end user. While ensembles of neural networks (ENNs) have long been heralded as a potential solution to these issues, deep learning methods that explicitly model the extent of the model's knowledge promise more principled and reliable behavior. This study compares their reliability in clinical applications. We centered our analysis on experiments with low-dimensional toy datasets and the exemplary case study of mortality prediction for intensive care unit hospitalizations, using Electronic Health Records (EHRs) from the MIMIC3 database. For predictions on the EHR time series, encoder-only Transformer models were employed. Knowledge uncertainty estimation is achieved with both ensemble and Spectral Normalized Neural Gaussian Process (SNGP) variants of the common Transformer model. We designed two datasets to test how reliably the models detect token-level and more subtle discrepancies, both in the toy setting and in the EHR setting. While both the SNGP and ENN model variants achieve similar prediction performance (AUROC ≈ 0.85, AUPRC ≈ 0.52 for in-hospital mortality prediction on a selected MIMIC3 benchmark), the former demonstrates improved capabilities to quantify knowledge uncertainty for individual samples/patients. Methods that include a knowledge model, such as SNGP, offer superior uncertainty estimation compared to traditional stochastic deep learning, leading to more trustworthy and safe clinical decision support.
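To make the notion of knowledge uncertainty concrete, the following minimal sketch shows one common way an ensemble of classifiers (such as the ENN Transformer variant mentioned above) can be turned into a per-patient knowledge-uncertainty score: the mutual information between the prediction and the ensemble member, obtained as total predictive entropy minus the members' average entropy. The function name, array shapes, and example probabilities are illustrative assumptions, not the implementation used in this study.

```python
import numpy as np

def knowledge_uncertainty(member_probs):
    """Decompose ensemble predictive uncertainty for a single sample.

    member_probs: array of shape (n_members, n_classes) holding the
    softmax outputs of each ensemble member for one sample/patient.
    Returns (total, expected, knowledge) uncertainty in nats, where the
    knowledge (epistemic) part is high when the members disagree.
    """
    member_probs = np.asarray(member_probs, dtype=float)
    eps = 1e-12  # guard against log(0)

    # Total uncertainty: entropy of the averaged predictive distribution.
    mean_probs = member_probs.mean(axis=0)
    total = -np.sum(mean_probs * np.log(mean_probs + eps))

    # Expected data (aleatoric) uncertainty: mean entropy of the members.
    expected = -np.mean(np.sum(member_probs * np.log(member_probs + eps), axis=1))

    # Knowledge uncertainty: mutual information = total - expected.
    return total, expected, total - expected

# Hypothetical example: three ensemble members predicting in-hospital
# mortality ([survival, mortality]); disagreement between members drives
# the knowledge-uncertainty term up.
probs = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
print(knowledge_uncertainty(probs))
```

An SNGP-style model arrives at a comparable per-sample score in a single forward pass, using a distance-aware output layer rather than member disagreement, which is the behaviour compared against the ensemble in this study.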