
Comparison of Predictive Modeling Approaches for 30-day All-cause Non-elective Readmission Risk

Overview
Publisher BioMed Central
Date 2016 Feb 28
PMID 26920363
Citations 28
Abstract

Background: This paper explores the importance of electronic medical records (EMR) for predicting 30-day all-cause non-elective readmission risk of patients and presents a comparison of prediction performance of commonly used methods.

Methods: The data are extracted from eight Advocate Health Care hospitals. Index admissions are excluded from the cohort if they are observation stays; inpatient admissions for psychiatry, skilled nursing, hospice, or rehabilitation; maternal and newborn visits; or admissions during which the patient expires. Data are randomly and repeatedly divided into fitting and validating sets for cross-validation. Approaches including LACE, STEPWISE logistic, LASSO logistic, and AdaBoost are compared with sample sizes varying from 2,500 to 80,000.
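The repeated random division into fitting and validating sets at several sample sizes can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `repeated_splits`, the 50/50 `fit_fraction`, the number of repeats, the seed, and the intermediate sample sizes are all assumptions. The abstract states only that sample sizes ranged from 2,500 to 80,000.

```python
import random

def repeated_splits(n_records, sample_size, n_repeats, fit_fraction=0.5, seed=0):
    """Yield (fit_idx, val_idx) index lists for repeated random divisions.

    Each repeat draws `sample_size` record indices without replacement from
    the full cohort, then splits them into a fitting and a validating part.
    `fit_fraction` is a hypothetical choice; the abstract does not state it.
    """
    rng = random.Random(seed)
    for _ in range(n_repeats):
        # rng.sample returns the chosen indices in random order,
        # so slicing the list gives a random division.
        sample = rng.sample(range(n_records), sample_size)
        cut = int(round(sample_size * fit_fraction))
        yield sample[:cut], sample[cut:]

# Hypothetical grid: only the endpoints 2,500 and 80,000 come from the abstract.
sample_sizes = [2500, 5000, 20000, 80000]

if __name__ == "__main__":
    for size in sample_sizes:
        for fit_idx, val_idx in repeated_splits(100_000, size, n_repeats=3):
            # A model would be fitted on fit_idx and scored on val_idx here.
            print(size, len(fit_idx), len(val_idx))
```

Fitting each method on the same sequence of splits keeps the comparison between methods paired, which is one reason to fix the seed.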

Results: Our results confirm that LACE has moderate discrimination power, with an area under the receiver operating characteristic curve (AUC) of around 0.65-0.66, which can be improved to 0.73-0.74 when additional variables from EMR are considered. These variables include Inpatient in the last six months, Number of emergency room visits or inpatients in the last year, Braden score, Polypharmacy, Employment status, Discharge disposition, Albumin level, and medical condition variables such as Leukemia, Malignancy, Renal failure with hemodialysis, History of alcohol substance abuse, Dementia and Trauma. When the sample size is small (≤5,000), LASSO is the best; when the sample size is large (≥20,000), the predictive performance of the methods is similar. The STEPWISE method has a slightly lower AUC (0.734) compared with LASSO (0.737) and AdaBoost (0.737). More than one half of the selected predictors can be false positives when using a single method and a single division of fitting/validating data.
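For readers unfamiliar with the metric, the AUC reported above equals the probability that a randomly chosen readmitted patient receives a higher predicted risk than a randomly chosen non-readmitted patient, with ties counted as one half. The sketch below computes it from ranks (the Mann-Whitney formulation). The function name `auc` and the toy data are illustrative assumptions and are unrelated to the study data.

```python
def auc(labels, scores):
    """Rank-based AUC for binary labels (1 = readmitted, 0 = not readmitted)."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    rank_sum_pos = 0.0
    n_pos = 0
    i = 0
    while i < n:
        # Find the block of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied scores share the average of their 1-based ranks i+1..j.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            if pairs[k][1] == 1:
                rank_sum_pos += avg_rank
                n_pos += 1
        i = j
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    # Mann-Whitney U statistic for the positives, scaled to [0, 1].
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

if __name__ == "__main__":
    # Toy example: three of the four positive/negative pairs are ordered correctly.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

On this scale 0.5 corresponds to random ordering and 1.0 to perfect separation, so the difference between 0.734 and 0.737 reported for STEPWISE and LASSO is small.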

Conclusions: True predictors can be identified by repeatedly dividing data into fitting/validating subsets and inferring the final model from the summarized results. LASSO is a better alternative to STEPWISE logistic regression, especially when the sample size is not large. The evidence for adequate sample size can be explored by fitting models on gradually reduced samples. Our model comparison strategy is not only suitable for 30-day all-cause non-elective readmission risk predictions, but also applicable to other types of predictive models in clinical studies.
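One way to summarize results across repeated divisions is to count how often each predictor is selected and keep only those selected in most repeats. The abstract does not say how the authors summarized their results, so the sketch below is an assumption throughout: the function names, the 0.8 threshold, and the example selections (including the `noise_a` and `noise_b` variables) are hypothetical.

```python
from collections import Counter

def selection_frequency(selected_per_split):
    """Fraction of repeated divisions in which each predictor was selected."""
    counts = Counter()
    for selected in selected_per_split:
        counts.update(set(selected))  # count each predictor once per division
    n_splits = len(selected_per_split)
    return {name: count / n_splits for name, count in counts.items()}

def stable_predictors(selected_per_split, min_frequency=0.8):
    """Predictors selected in at least `min_frequency` of the divisions.

    The 0.8 cut-off is a hypothetical choice, not a value from the paper.
    """
    freq = selection_frequency(selected_per_split)
    return sorted(name for name, f in freq.items() if f >= min_frequency)

if __name__ == "__main__":
    # Made-up selections from five divisions; not results from the study.
    runs = [
        {"Braden score", "Albumin level"},
        {"Braden score", "noise_a"},
        {"Braden score", "Albumin level", "noise_b"},
        {"Braden score", "Albumin level"},
        {"Braden score", "Albumin level"},
    ]
    print(stable_predictors(runs))  # ['Albumin level', 'Braden score']
```

A predictor that appears in only one division, like the noise variables here, is the kind of false positive that a single fitting/validating split cannot reveal.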

Citing Articles

Comparison of two modeling approaches for the identification of predictors of complications in children with cerebral palsy following spine surgery.

Difazio R, Strout T, Vessey J, Berry J, Whitney D. BMC Med Res Methodol. 2024; 24(1):236.

PMID: 39394575 PMC: 11468503. DOI: 10.1186/s12874-024-02360-w.


Comparison of machine-learning and logistic regression models for prediction of 30-day unplanned readmission in electronic health records: A development and validation study.

Iwagami M, Inokuchi R, Kawakami E, Yamada T, Goto A, Kuno T. PLOS Digit Health. 2024; 3(8):e0000578.

PMID: 39163277 PMC: 11335098. DOI: 10.1371/journal.pdig.0000578.


Using Machine Learning Models to Identify Factors Associated With 30-Day Readmissions After Posterior Cervical Fusions: A Longitudinal Cohort Study.

Gonzalez-Suarez A, Rezaii P, Herrick D, Tigchelaar S, Ratliff J, Rusu M. Neurospine. 2024; 21(2):620-632.

PMID: 38768945 PMC: 11224744. DOI: 10.14245/ns.2347340.670.


Social and Behavioral Determinants of Health in the Era of Artificial Intelligence with Electronic Health Records: A Scoping Review.

Bompelli A, Wang Y, Wan R, Singh E, Zhou Y, Xu L. Health Data Sci. 2024; 2021:9759016.

PMID: 38487504 PMC: 10880156. DOI: 10.34133/2021/9759016.


Development and Validation of a Prediction Model on Adult Emergency Department Patients for Early Identification of Fulminant Myocarditis.

Jiang M, Ke J, Fang M, Huang S, Li Y. Curr Med Sci. 2023; 43(5):961-969.

PMID: 37450071 DOI: 10.1007/s11596-023-2768-8.


References
1.
Friedman J, Hastie T, Tibshirani R . Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw. 2010; 33(1):1-22. PMC: 2929880. View

2.
Cook N . Use and misuse of the receiver operating characteristic curve in risk prediction. Circulation. 2007; 115(7):928-35. DOI: 10.1161/CIRCULATIONAHA.106.672402. View

3.
Choudhry S, Li J, Davis D, Erdmann C, Sikka R, Sutariya B . A public-private partnership develops and externally validates a 30-day hospital readmission risk prediction model. Online J Public Health Inform. 2013; 5(2):219. PMC: 3812998. DOI: 10.5210/ojphi.v5i2.4726. View

4.
Harrell Jr F, Lee K, Mark D . Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996; 15(4):361-87. DOI: 10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4. View

5.
Kind A, Jencks S, Brock J, Yu M, Bartels C, Ehlenbach W . Neighborhood socioeconomic disadvantage and 30-day rehospitalization: a retrospective cohort study. Ann Intern Med. 2014; 161(11):765-74. PMC: 4251560. DOI: 10.7326/M13-2946. View