
Development and Validation of Predictive Models for Unplanned Hospitalization in the Basque Country: Analyzing the Variability of Non-deterministic Algorithms

Overview
Publisher BioMed Central
Date 2023 Aug 5
PMID 37543596
Abstract

Background: Progressive population ageing in developed countries entails an increase in multimorbidity. Population-wide predictive models for adverse health outcomes are crucial to address these growing healthcare needs. The main objective of this study is to develop and validate a population-based prognostic model to predict the probability of unplanned hospitalization in the Basque Country by comparing the performance of a logistic regression model and three families of machine learning models.

Methods: Using age, sex, diagnoses and drug prescriptions previously transformed by the Johns Hopkins Adjusted Clinical Groups (ACG) System, we predict the probability of unplanned hospitalization in the Basque Country (2.2 million inhabitants) using several techniques. Because non-deterministic algorithms can yield a different model on each training run, comparing a single model per technique is not enough to choose the best approach. We therefore conduct 40 experiments per family of models (Random Forest, Gradient Boosting Decision Trees and Multilayer Perceptrons) and compare them to Logistic Regression. Model performance is compared both population-wide and for the 20,000 patients with the highest predicted probabilities, taken as a hypothetical high-risk group for intervention.
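The evaluation design described in the Methods can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the cohort is synthetic, `simulated_run` is a hypothetical stand-in that mimics a non-deterministic learner by adding seed-dependent noise to a known risk, and a top-20 group stands in for the 20,000 highest-risk patients. It shows how each run can be scored with the Brier score, the area under the ROC curve and the positive predictive value in the top-k group, and how the 40 runs can then be summarized.

```python
import random
from statistics import median


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def ppv_at_top_k(y_true, y_prob, k):
    """Share of true events among the k highest predicted probabilities."""
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(y for _, y in top) / len(top)


def roc_auc(y_true, y_prob):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win. Quadratic in cohort size, so only
    suitable for small illustrative data.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic cohort (assumption): fixed across runs, like a fixed test set.
cohort_rng = random.Random(0)
true_risk = [cohort_rng.random() * 0.3 for _ in range(500)]
outcomes = [1 if cohort_rng.random() < r else 0 for r in true_risk]


def simulated_run(seed, noise=0.05):
    """Hypothetical stand-in for training one model with a given seed."""
    rng = random.Random(seed)
    return [min(1.0, max(0.0, r + rng.gauss(0.0, noise))) for r in true_risk]


# 40 runs per technique, as in the Methods; only one "technique" is shown.
runs = []
for seed in range(40):
    probs = simulated_run(seed)
    runs.append({
        "auc": roc_auc(outcomes, probs),
        "brier": brier_score(outcomes, probs),
        "ppv_top_k": ppv_at_top_k(outcomes, probs, k=20),
    })

for name in ("auc", "brier", "ppv_top_k"):
    values = [run[name] for run in runs]
    print(f"{name}: median={median(values):.3f} "
          f"min={min(values):.3f} max={max(values):.3f}")
```

Keeping the cohort fixed while varying only the seed isolates the variability that comes from the algorithm itself, which is the quantity the comparison across techniques relies on.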

Results: The best-performing technique is Multilayer Perceptron, followed by Gradient Boosting Decision Trees, Logistic Regression and Random Forest. Multilayer Perceptrons also have the lowest variability, around an order of magnitude less than Random Forests. Across techniques, the median area under the ROC curve, average precision and positive predictive value range from 0.789 to 0.802, 0.237 to 0.257 and 0.485 to 0.511, respectively. The median Brier score is 0.048 for all techniques. The results of the different algorithms overlap to some extent: for instance, Gradient Boosting Decision Trees perform better than Logistic Regression more than 75% of the time, but not always.
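A statement such as "better than Logistic Regression more than 75% of the time" can be read as the share of runs whose score exceeds a baseline value. The sketch below shows one way to compute that share and a simple spread summary. It assumes the baseline is a single number, and the scores in the usage example are made-up illustrative values, not results from the study; `share_above` and `spread` are hypothetical helper names.

```python
from statistics import median, quantiles


def share_above(scores, baseline):
    """Fraction of runs whose score is strictly above the baseline."""
    return sum(s > baseline for s in scores) / len(scores)


def spread(scores):
    """Median, interquartile range and full range of per-run scores."""
    q1, _, q3 = quantiles(scores, n=4)
    return {
        "median": median(scores),
        "iqr": q3 - q1,
        "range": max(scores) - min(scores),
    }


# Made-up per-run scores for two hypothetical techniques (higher is better).
stable_scores = [0.71, 0.72, 0.71, 0.72, 0.71, 0.72, 0.71, 0.72]
volatile_scores = [0.62, 0.75, 0.66, 0.74, 0.69, 0.73, 0.64, 0.71]
baseline = 0.70  # hypothetical single baseline score

for label, scores in (("stable", stable_scores), ("volatile", volatile_scores)):
    print(label, share_above(scores, baseline), spread(scores))
```

Reporting the share above baseline together with the spread distinguishes a technique that beats the baseline in every run from one that is better only on average.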

Conclusions: All models have good global performance. The only family that is consistently superior to Logistic Regression is Multilayer Perceptron, which shows very reliable performance with the lowest variability.

Citing Articles

Interpretable machine learning for predicting sepsis risk in emergency triage patients.

Liu Z, Shu W, Li T, Zhang X, Chong W. Sci Rep. 2025; 15(1):887.

PMID: 39762406 PMC: 11704257. DOI: 10.1038/s41598-025-85121-z.


Validity of the Johns Hopkins Adjusted Clinical Groups system on the utilisation of healthcare services in Norway: a retrospective cross-sectional study.

Hosar R, Berntsen G, Steinsbekk A. BMC Health Serv Res. 2024; 24(1):1279.

PMID: 39448990 PMC: 11515438. DOI: 10.1186/s12913-024-11715-4.


Using machine learning methods to predict all-cause somatic hospitalizations in adults: A systematic review.

Askar M, Tafavvoghi M, Smabrekke L, Bongo L, Svendsen K. PLoS One. 2024; 19(8):e0309175.

PMID: 39178283 PMC: 11343463. DOI: 10.1371/journal.pone.0309175.
