
A Three-step Approach for the Derivation and Validation of High-performing Predictive Models Using an Operational Dataset: Congestive Heart Failure Readmission Case Study

Overview
Publisher Biomed Central
Date 2014 Jun 3
PMID 24886637
Citations 6
Abstract

Background: The aim of this study was to propose an analytical approach to develop high-performing predictive models for congestive heart failure (CHF) readmission using an operational dataset with incomplete records and changing data over time.

Methods: Our analytical approach involves three steps: pre-processing, systematic model development, and risk factor analysis. In pre-processing, variables that were absent in >50% of records were removed. The dataset was then divided into a validation dataset and derivation datasets, which were separated into three temporal subsets based on changes to the data over time. In systematic model development, models were built from the different temporal datasets and the remaining explanatory variables by combining various (i) statistical analyses to explore the relationships between the validation and derivation datasets; (ii) adjustment methods for handling missing values; (iii) classifiers; (iv) feature selection methods; and (v) discretization methods. We then selected the best derivation dataset and the models with the highest predictive performance. In risk factor analysis, factors in the highest-performing predictive models were analyzed and ranked using (i) statistical analyses of the best derivation dataset, (ii) feature rankers, and (iii) a newly developed algorithm that categorizes risk factors as strong, regular, or weak.
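The pre-processing rule and the two missing-value adjustment methods named in this abstract (complete-case analysis and mean-based imputation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the field names (`age`, `hemoglobin`, `bnp`), the toy records, and the helper function names are hypothetical, and missing values are assumed to be encoded as `None`.

```python
from statistics import mean

def drop_sparse_variables(records, max_missing=0.5):
    """Remove variables that are missing (None) in more than max_missing of records."""
    n = len(records)
    kept = [
        v for v in records[0]
        if sum(r[v] is None for r in records) / n <= max_missing
    ]
    return [{v: r[v] for v in kept} for r in records]

def complete_cases(records):
    """Complete-case analysis: keep only records with no missing values."""
    return [r for r in records if all(x is not None for x in r.values())]

def mean_impute(records):
    """Mean-based imputation: replace each missing value with the variable's observed mean."""
    means = {
        v: mean(r[v] for r in records if r[v] is not None)
        for v in records[0]
    }
    return [{v: (means[v] if r[v] is None else r[v]) for v in r} for r in records]

# Hypothetical toy data; 'bnp' is missing in 3 of 4 records (>50%).
records = [
    {"age": 71, "hemoglobin": 10.2, "bnp": None},
    {"age": 64, "hemoglobin": None, "bnp": None},
    {"age": 80, "hemoglobin": 12.0, "bnp": None},
    {"age": 58, "hemoglobin": 13.4, "bnp": 900.0},
]

filtered = drop_sparse_variables(records)   # drops 'bnp'
cc_dataset = complete_cases(filtered)       # drops the record lacking hemoglobin
imputed_dataset = mean_impute(filtered)     # fills the missing hemoglobin value
```

Complete-case analysis shrinks the dataset but leaves observed values untouched, whereas mean imputation keeps every record at the cost of flattening variance in the imputed variable. The abstract reports that the complete-case derivation dataset gave the best-performing models in this study.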

Results: The analysis dataset consisted of 2,787 CHF hospitalizations at University of Utah Health Care from January 2003 to June 2013. In this study, we used the complete-case analysis and mean-based imputation adjustment methods; the wrapper subset feature selection method; and four ranking strategies based on information gain, gain ratio, symmetrical uncertainty, and wrapper subset feature evaluators. The best-performing models resulted from the use of a complete-case analysis derivation dataset combined with the Class-Attribute Contingency Coefficient discretization method and a voting classifier which averaged the results of multinomial logistic regression and voting feature intervals classifiers. Of 42 final model risk factors, discharge disposition, discretized age, and indicators of anemia were the most significant. This model achieved a c-statistic of 86.8%.
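As a rough illustration of the two quantities reported here, the sketch below averages the predicted readmission probabilities of two classifiers and computes a c-statistic as the fraction of (readmitted, not readmitted) pairs that the score ranks correctly, with ties counted as half. It assumes that "averaged the results" means averaging predicted probabilities, which the abstract does not state. The labels, the probability vectors, and the function names are hypothetical and do not come from the study.

```python
def average_vote(probs_a, probs_b):
    """Combine two classifiers by averaging their predicted probabilities."""
    return [(a + b) / 2 for a, b in zip(probs_a, probs_b)]

def c_statistic(labels, scores):
    """Concordance (c-statistic, equal to ROC AUC for a binary outcome).

    Counts, over all pairs of one positive and one negative case, how often
    the positive case receives the higher score; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))

# Hypothetical outcomes (1 = readmitted) and per-classifier probabilities.
labels = [1, 0, 1, 0, 0]
probs_lr = [0.80, 0.30, 0.40, 0.50, 0.10]   # stand-in for logistic regression
probs_vfi = [0.70, 0.20, 0.60, 0.30, 0.20]  # stand-in for voting feature intervals

combined = average_vote(probs_lr, probs_vfi)
c_lr = c_statistic(labels, probs_lr)
c_combined = c_statistic(labels, combined)
```

A c-statistic of 0.5 corresponds to ranking no better than chance and 1.0 to perfect ranking, so the reported 86.8% means that in roughly 87 of 100 such pairs the readmitted patient received the higher predicted risk.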

Conclusion: The proposed three-step analytical approach enhanced predictive model performance for CHF readmissions. It could potentially be leveraged to improve predictive model performance in other areas of clinical medicine.

Citing Articles

Predictive Analytics in Heart Failure Risk, Readmission, and Mortality Prediction: A Review.

Hidayaturrohman Q, Hanada E. Cureus. 2024; 16(11):e73876.

PMID: 39697926 PMC: 11652958. DOI: 10.7759/cureus.73876.


Feature selection and association rule learning identify risk factors of malnutrition among Ethiopian schoolchildren.

Russel W, Perry J, Bonzani C, Dontino A, Mekonnen Z, Ay A. Front Epidemiol. 2024; 3:1150619.

PMID: 38455884 PMC: 10910994. DOI: 10.3389/fepid.2023.1150619.


Machine learning-based risk factor analysis and prevalence prediction of intestinal parasitic infections using epidemiological survey data.

Zafar A, Attia Z, Tesfaye M, Walelign S, Wordofa M, Abera D. PLoS Negl Trop Dis. 2022; 16(6):e0010517.

PMID: 35700192 PMC: 9236253. DOI: 10.1371/journal.pntd.0010517.


The Utility of Nursing Notes Among Medicare Patients With Heart Failure to Predict 30-Day Rehospitalization: A Pilot Study.

Kang Y, Topaz M, Dunbar S, Stehlik J, Hurdle J. J Cardiovasc Nurs. 2021; 37(6):E181-E186.

PMID: 34935742 PMC: 9918309. DOI: 10.1097/JCN.0000000000000871.


Post-acute care referral in United States of America: a multiregional study of factors associated with referral destination in a cohort of patients with coronary artery bypass graft or valve replacement.

Sultana I, Erraguntla M, Kum H, Delen D, Lawley M. BMC Med Inform Decis Mak. 2019; 19(1):223.

PMID: 31727058 PMC: 6854767. DOI: 10.1186/s12911-019-0955-0.

