
Machine Learning Approaches Improve Risk Stratification for Secondary Cardiovascular Disease Prevention in Multiethnic Patients

Overview
Journal Open Heart
Date 2021 Oct 20
PMID 34667093
Citations 6
Abstract

Objectives: Identifying high-risk patients is crucial for effective cardiovascular disease (CVD) prevention. It is not known whether electronic health record (EHR)-based machine-learning (ML) models can improve CVD risk stratification compared with a secondary prevention risk score developed from randomised clinical trials (Thrombolysis in Myocardial Infarction Risk Score for Secondary Prevention, TRS 2°P).

Methods: We identified patients with CVD, including atherosclerotic CVD (ASCVD), in a large health system and split them into 80% training and 20% test sets. A rich set of EHR patient features was extracted. ML models were trained to estimate 5-year CVD event risk: random forests (RF), gradient-boosted machines (GBM), extreme gradient-boosted models (XGBoost), and logistic regression with an L2 penalty and with an L1 penalty (Lasso). ML models and TRS 2°P were evaluated by the area under the receiver operating characteristic curve (AUC).
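The two mechanical steps in the Methods, the 80%/20% split and the AUC evaluation, can be illustrated with a minimal sketch. It uses only the Python standard library and is not the authors' pipeline: the function names `train_test_split` and `auc`, the random seed, and the toy labels and scores are all assumptions made for illustration. The AUC is computed as the probability that a randomly chosen patient with an event receives a higher risk score than a randomly chosen patient without one, with ties counted as one half.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle a copy of `records` and return (train, test).

    Hypothetical helper: the abstract states only the 80%/20% proportions,
    not how the split was drawn.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (event, non-event) pairs in which the event case
    has the higher score; tied scores contribute one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only, not from the study).
train, test = train_test_split(range(100))
print(len(train), len(test))  # 80 20
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to a score that ranks patients no better than chance, which is the reference point for reading the TRS 2°P results.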

Results: The cohort included 32 192 patients (median age 74 years, with 46% female, 63% non-Hispanic white and 12% Asian patients and 23 475 patients with ASCVD). There were 4010 events over 5 years of follow-up. ML models demonstrated good overall performance; XGBoost demonstrated AUC 0.70 (95% CI 0.68 to 0.71) in the full CVD cohort and AUC 0.71 (95% CI 0.69 to 0.73) in patients with ASCVD, with comparable performance by GBM, RF and Lasso. TRS 2°P performed poorly in all CVD (AUC 0.51, 95% CI 0.50 to 0.53) and ASCVD (AUC 0.50, 95% CI 0.48 to 0.52) patients. ML identified nontraditional predictive variables including education level and primary care visits.
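The Results report each AUC with a 95% CI, but the abstract does not say how the intervals were obtained. One common option is a percentile bootstrap over the test set, sketched below under that assumption. The function `bootstrap_auc_ci`, its parameters `n_boot`, `alpha` and `seed`, and the resampling scheme are hypothetical choices for illustration, not the authors' stated method.

```python
import random


def auc(labels, scores):
    """Pairwise AUC: share of (event, non-event) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC.

    Assumption: patients are resampled with replacement and the AUC is
    recomputed on each resample; the abstract does not specify the method.
    """
    if len(set(labels)) < 2:
        raise ValueError("need both events and non-events")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample has only one class, so AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

Calling `bootstrap_auc_ci(labels, scores)` on held-out labels and predicted risks returns a `(lower, upper)` pair. An interval whose lower bound sits close to 0.5, as reported for TRS 2°P, indicates discrimination that is hard to distinguish from chance.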

Conclusions: In a multiethnic real-world population, EHR-based ML approaches significantly improved CVD risk stratification for secondary prevention.

Citing Articles

Unlocking the link: predicting cardiovascular disease risk with a focus on airflow obstruction using machine learning.

Cao X, Ma J, He X, Liu Y, Yang Y, Wang Y. BMC Med Inform Decis Mak. 2025; 25(1):50.

PMID: 39901185 PMC: 11792416. DOI: 10.1186/s12911-025-02885-0.


Artificial Intelligence in Ischemic Heart Disease Prevention.

Parsa S, Shah P, Doijad R, Rodriguez F. Curr Cardiol Rep. 2025; 27(1):44.

PMID: 39891819 DOI: 10.1007/s11886-025-02203-0.


Uses of Social Determinants of Health Data to Address Cardiovascular Disease and Health Equity: A Scoping Review.

McNeill E, Lindenfeld Z, Mostafa L, Zein D, Silver D, Pagan J. J Am Heart Assoc. 2023; 12(21):e030571.

PMID: 37929716 PMC: 10727404. DOI: 10.1161/JAHA.123.030571.


A systematic review of clinical health conditions predicted by machine learning diagnostic and prognostic models trained or validated using real-world primary health care data.

Abdulazeem H, Whitelaw S, Schauberger G, Klug S. PLoS One. 2023; 18(9):e0274276.

PMID: 37682909 PMC: 10491005. DOI: 10.1371/journal.pone.0274276.


The value of parental medical records for the prediction of diabetes and cardiovascular disease: a novel method for generating and incorporating family histories.

Barak-Corren Y, Tsurel D, Keidar D, Gofer I, Shahaf D, Leventer-Roberts M. J Am Med Inform Assoc. 2023; 30(12):1915-1924.

PMID: 37535812 PMC: 10654871. DOI: 10.1093/jamia/ocad154.

