Generalizability of an Acute Kidney Injury Prediction Model Across Health Systems

Overview

Journal: Nat Mach Intell

Publisher: Springer Nature

Specialty: Biomedical Engineering

Date: 2023 Dec 27

PMID: 38148789

Authors

Jie Cao

Xiaosong Zhang

Vahakn Shahinian

Huiying Yin

Diane Steffick

Rajiv Saran

Susan Crowley

Michael Mathis

Girish N Nadkarni

Michael Heung

Karandeep Singh

Abstract

Delays in the identification of acute kidney injury (AKI) in hospitalized patients are a major barrier to the development of effective interventions to treat AKI. A recent study by Tomasev and colleagues at DeepMind described a model that achieved state-of-the-art performance in predicting AKI up to 48 hours in advance. Because this model was trained in a population of US Veterans that was 94% male, questions have arisen about its reproducibility and generalizability. In this study, we aimed to reproduce key aspects of this model, train and evaluate it in a similar population of US Veterans, and evaluate its generalizability in a large academic hospital setting. We found that the model performed worse in predicting AKI in females in both populations, with miscalibration in lower stages of AKI and worse discrimination (a lower area under the curve) in higher stages of AKI. We demonstrate that this discrepancy in performance can be largely corrected in non-Veterans by updating the original model using data from a sex-balanced academic hospital cohort, but that the worse model performance persists in Veterans. Our study highlights the importance of reproducing artificial intelligence studies and the complexity of performance discrepancies across subgroups, which cannot be explained by sample size alone.
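The subgroup comparison described in the abstract (discrimination and calibration reported separately by sex) can be illustrated with a minimal sketch. This is not the authors' code or model. The data are purely synthetic, the 940/60 split only mirrors the 94% male proportion mentioned above, the noise levels are arbitrary, and the helper names (`auc`, `calibration_in_the_large`, `subgroup_report`, `make_synthetic`) are hypothetical. Discrimination is computed as the area under the ROC curve through the Mann-Whitney statistic, and calibration is summarized crudely as the observed event rate minus the mean predicted risk.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_in_the_large(labels, scores):
    """Observed event rate minus mean predicted risk (0 means calibrated on average)."""
    return sum(labels) / len(labels) - sum(scores) / len(scores)


def subgroup_report(records):
    """Summarize (group, label, predicted_risk) tuples separately for each group."""
    by_group = {}
    for group, label, risk in records:
        ys, ps = by_group.setdefault(group, ([], []))
        ys.append(label)
        ps.append(risk)
    return {
        g: {"n": len(ys), "auc": auc(ys, ps), "citl": calibration_in_the_large(ys, ps)}
        for g, (ys, ps) in by_group.items()
    }


def make_synthetic(n, group, noise, seed):
    """Synthetic rows only: predicted risk is the true probability plus Gaussian noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        true_p = rng.uniform(0.02, 0.6)
        label = 1 if rng.random() < true_p else 0
        risk = min(1.0, max(0.0, true_p + rng.gauss(0.0, noise)))
        rows.append((group, label, risk))
    return rows


# Assumed, illustrative cohort: 94% male, with noisier predictions for females.
records = make_synthetic(940, "male", 0.05, 1) + make_synthetic(60, "female", 0.30, 2)
report = subgroup_report(records)
for group, metrics in report.items():
    print(group, metrics)
```

Reporting each metric per subgroup, rather than only for the pooled cohort, is what makes a gap of the kind described in the abstract visible: a pooled estimate dominated by the majority group can hide poor performance in the minority group.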

Citing Articles

A Review of Leveraging Artificial Intelligence to Predict Persistent Postoperative Opioid Use and Opioid Use Disorder and its Ethical Considerations.

Gabriel R, Park B, Hsu C, Macias A. Curr Pain Headache Rep. 2025; 29(1):30.

PMID: 39847176 PMC: 11758157. DOI: 10.1007/s11916-024-01319-2.

Performance of Machine Learning Suicide Risk Models in an American Indian Population.

Haroz E, Rebman P, Goklish N, Garcia M, Suttle R, Maggio D. JAMA Netw Open. 2024; 7(10):e2439269.

PMID: 39401036 PMC: 11474420. DOI: 10.1001/jamanetworkopen.2024.39269.

Development and Validation of a Machine Learning COVID-19 Veteran (COVet) Deterioration Risk Score.

Govindan S, Spicer A, Bearce M, Schaefer R, Uhl A, Alterovitz G. Crit Care Explor. 2024; 6(7):e1116.

PMID: 39028867 PMC: 11262818. DOI: 10.1097/CCE.0000000000001116.

An empirical study on KDIGO-defined acute kidney injury prediction in the intensive care unit.

Lyu X, Fan B, Huser M, Hartout P, Gumbsch T, Faltys M. Bioinformatics. 2024; 40(Suppl 1):i247-i256.

PMID: 38940165 PMC: 11211814. DOI: 10.1093/bioinformatics/btae212.

Transforming the cardiometabolic disease landscape: Multimodal AI-powered approaches in prevention and management.

Muse E, Topol E. Cell Metab. 2024; 36(4):670-683.

PMID: 38428435 PMC: 10990799. DOI: 10.1016/j.cmet.2024.02.002.

References

1. Wilson F, Shashaty M, Testani J, Aqeel I, Borovskiy Y, Ellenberg S. Automated, electronic alerts for acute kidney injury: a single-blind, parallel-group, randomised controlled trial. Lancet. 2015; 385(9981):1966-74. PMC: 4475457. DOI: 10.1016/S0140-6736(15)60266-5.

2. McDermott M, Wang S, Marinsek N, Ranganath R, Foschini L, Ghassemi M. Reproducibility in machine learning for health research: Still a ways to go. Sci Transl Med. 2021; 13(586). DOI: 10.1126/scitranslmed.abb1655.

3. Hoste E, Kellum J, Selby N, Zarbock A, Palevsky P, Bagshaw S. Global epidemiology and outcomes of acute kidney injury. Nat Rev Nephrol. 2018; 14(10):607-625. DOI: 10.1038/s41581-018-0052-0.

4. Peng J, Wu T, Wu X, Yan P, Kang Y, Liu Y. Development of mortality prediction model in the elderly hospitalized AKI patients. Sci Rep. 2021; 11(1):15157. PMC: 8313696. DOI: 10.1038/s41598-021-94271-9.

5. Wong A, Otles E, Donnelly J, Krumm A, McCullough J, DeTroyer-Cooley O. External Validation of a Widely Implemented Proprietary Sepsis Prediction Model in Hospitalized Patients. JAMA Intern Med. 2021; 181(8):1065-1070. PMC: 8218233. DOI: 10.1001/jamainternmed.2021.2626.

6. Tomasev N, Harris N, Baur S, Mottram A, Glorot X, Rae J. Use of deep learning to develop continuous-risk models for adverse event prediction from electronic health records. Nat Protoc. 2021; 16(6):2765-2787. DOI: 10.1038/s41596-021-00513-5.

7. Tomasev N, Glorot X, Rae J, Zielinski M, Askham H, Saraiva A. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature. 2019; 572(7767):116-119. PMC: 6722431. DOI: 10.1038/s41586-019-1390-1.

8. Singh K, Beam A, Nallamothu B. Machine Learning in Clinical Journals: Moving From Inscrutable to Informative. Circ Cardiovasc Qual Outcomes. 2020; 13(10):e007491. PMC: 9126253. DOI: 10.1161/CIRCOUTCOMES.120.007491.

9. Koyner J, Adhikari R, Edelson D, Churpek M. Development of a Multicenter Ward-Based AKI Prediction Model. Clin J Am Soc Nephrol. 2016; 11(11):1935-1943. PMC: 5108182. DOI: 10.2215/CJN.00280116.

10. Koyner J, Carey K, Edelson D, Churpek M. The Development of a Machine Learning Inpatient Acute Kidney Injury Prediction Model. Crit Care Med. 2018; 46(7):1070-1077. DOI: 10.1097/CCM.0000000000003123.

11. Carter R, Attia Z, Lopez-Jimenez F, Friedman P. Pragmatic considerations for fostering reproducible research in artificial intelligence. NPJ Digit Med. 2019; 2:42. PMC: 6550149. DOI: 10.1038/s41746-019-0120-2.

12. Motwani S, McMahon G, Humphreys B, Partridge A, Waikar S, Curhan G. Development and Validation of a Risk Prediction Model for Acute Kidney Injury After the First Course of Cisplatin. J Clin Oncol. 2018; 36(7):682-688. PMC: 5946720. DOI: 10.1200/JCO.2017.75.7161.

13. Khwaja A. KDIGO clinical practice guidelines for acute kidney injury. Nephron Clin Pract. 2012; 120(4):c179-84. DOI: 10.1159/000339789.

14. Stupple A, Singerman D, Celi L. The reproducibility crisis in the age of digital medicine. NPJ Digit Med. 2019; 2:2. PMC: 6550262. DOI: 10.1038/s41746-019-0079-z.

15. McCradden M, Stephenson E, Anderson J. Clinical research underlies ethical integration of healthcare artificial intelligence. Nat Med. 2020; 26(9):1325-1326. DOI: 10.1038/s41591-020-1035-9.

16. Larrazabal A, Nieto N, Peterson V, Milone D, Ferrante E. Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis. Proc Natl Acad Sci U S A. 2020; 117(23):12592-12594. PMC: 7293650. DOI: 10.1073/pnas.1919012117.

17. Haines R, Lin S, Hewson R, Kirwan C, Torrance H, O'Dwyer M. Acute Kidney Injury in Trauma Patients Admitted to Critical Care: Development and Validation of a Diagnostic Prediction Model. Sci Rep. 2018; 8(1):3665. PMC: 5827665. DOI: 10.1038/s41598-018-21929-2.

18. Haibe-Kains B, Adam G, Hosny A, Khodakarami F, Waldron L, Wang B. Transparency and reproducibility in artificial intelligence. Nature. 2020; 586(7829):E14-E16. PMC: 8144864. DOI: 10.1038/s41586-020-2766-y.

19. Singh K, Valley T, Tang S, Li B, Kamran F, Sjoding M. Evaluating a Widely Implemented Proprietary Deterioration Index Model among Hospitalized Patients with COVID-19. Ann Am Thorac Soc. 2020; 18(7):1129-1137. PMC: 8328366. DOI: 10.1513/AnnalsATS.202006-698OC.

20. DeLong E, DeLong D, Clarke-Pearson D. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988; 44(3):837-45.