Generalizability of an Acute Kidney Injury Prediction Model Across Health Systems
Delays in identifying acute kidney injury (AKI) in hospitalized patients are a major barrier to developing effective interventions to treat AKI. A recent study by Tomasev and colleagues at DeepMind described a model that achieved state-of-the-art performance in predicting AKI up to 48 hours in advance. Because this model was trained in a population of US Veterans that was 94% male, questions have arisen about its reproducibility and generalizability. In this study, we reproduced key aspects of this model, trained and evaluated it in a similar population of US Veterans, and evaluated its generalizability in a large academic hospital setting. We found that the model performed worse in predicting AKI in females in both populations, with miscalibration in lower stages of AKI and worse discrimination (a lower area under the curve) in higher stages of AKI. We demonstrate that while this discrepancy in performance can be largely corrected in non-Veterans by updating the original model using data from a sex-balanced academic hospital cohort, the worse model performance persists in Veterans. Our study highlights the importance of reproducing artificial intelligence studies and shows that discrepancies in model performance across subgroups can be complex and cannot be explained simply on the basis of sample size.
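The subgroup comparison described above rests on two kinds of metric: discrimination (area under the ROC curve) and calibration. The following is a minimal sketch of how such a per-subgroup check might be computed, not the authors' actual evaluation pipeline. The function names, the `(group, label, score)` record layout, and the toy data are all hypothetical, and the calibration measure shown (mean predicted risk minus observed event rate) is only one simple choice among many.

```python
from collections import defaultdict


def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Counts the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_gap(labels, scores):
    """Mean predicted risk minus observed event rate.

    Positive values mean the model over-predicts risk on average.
    """
    return sum(scores) / len(scores) - sum(labels) / len(labels)


def metrics_by_group(records):
    """Compute metrics per subgroup from (group, label, score) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for group, label, score in records:
        grouped[group][0].append(label)
        grouped[group][1].append(score)
    return {
        g: {
            "n": len(ys),
            "auroc": auroc(ys, ss),
            "calibration_gap": calibration_gap(ys, ss),
        }
        for g, (ys, ss) in grouped.items()
    }


# Toy, made-up records for illustration only (not study data).
records = [
    ("male", 1, 0.9), ("male", 0, 0.2), ("male", 1, 0.7), ("male", 0, 0.1),
    ("female", 1, 0.6), ("female", 0, 0.7), ("female", 1, 0.8), ("female", 0, 0.3),
]
results = metrics_by_group(records)
for group, m in results.items():
    print(group, m)
```

Reporting the subgroup size `n` alongside each metric makes it easier to judge whether a gap could plausibly be a sample-size effect. In practice, confidence intervals (for example, by bootstrapping) would be needed before drawing any conclusion from differences like these.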