
Automating and Improving Cardiovascular Disease Prediction Using Machine Learning and EMR Data Features from a Regional Healthcare System

Overview
Date 2022 May 5
PMID 35512622
Abstract

Background: The ACC/AHA Pooled Cohort Equations (PCE) Risk Calculator is widely used in the US for primary prevention of atherosclerotic cardiovascular disease (ASCVD), but may under- or over-estimate risk in some populations. We therefore designed an automated, population-specific ASCVD risk calculator using machine-learning (ML) methods and electronic medical record (EMR) data, and compared its predictive power with that of the PCE calculator.

Methods And Findings: We collected data from 101,110 unique EMRs of living patients from January 1, 2009 to April 30, 2020. ML techniques were applied to patient datasets that included either only cross-sectional (CS) features, or CS features combined with longitudinal (LT) features derived from vital statistics and laboratory values. We compared the utility of the models using a proposed new cost measure (Screened Cases Percentage @ Sensitivity level). All ML models tested achieved better predictive power than the PCE risk calculator. The random forest (RF) technique applied to the combination of CS and LT features (RF-LTC) produced the best area under the curve (AUC), 0.902 (95% confidence interval (CI), 0.895-0.910). To detect 90% of all positive ASCVD cases, the best ML model required screening only 43% of patients, while the PCE risk calculator required screening 69% of patients.
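The abstract names the Screened Cases Percentage @ Sensitivity measure without defining it. The sketch below shows one plausible reading, consistent with the 43% versus 69% comparison: rank patients by predicted risk, screen from the top, and report the fraction of patients screened by the time the target share of true ASCVD cases has been captured. The function name, arguments, and tie handling are assumptions made for illustration, not the authors' implementation.

```python
import math


def screened_fraction_at_sensitivity(y_true, scores, sensitivity=0.9):
    """Fraction of patients that must be screened, highest risk first,
    to capture at least `sensitivity` of all positive cases.

    Hypothetical helper: an assumed reading of the paper's
    "Screened Cases Percentage @ Sensitivity level" measure.
    y_true: iterable of 0/1 outcome labels.
    scores: iterable of predicted risk scores (higher = riskier).
    """
    y_true = list(y_true)
    scores = list(scores)
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("at least one positive case is required")

    # Number of positives that must be found; the small epsilon guards
    # against floating-point noise such as 0.7 * 10 == 7.000000000000001.
    needed = max(1, math.ceil(sensitivity * total_pos - 1e-9))

    # Rank patients by descending score. Ties keep their input order,
    # which is an arbitrary choice made for this sketch.
    ranked = sorted(zip(scores, y_true), key=lambda p: p[0], reverse=True)

    found = 0
    for k, (_, label) in enumerate(ranked, start=1):
        found += label
        if found >= needed:
            return k / len(ranked)
    return 1.0


if __name__ == "__main__":
    # Toy data, not from the study: 10 patients, 4 positive cases.
    labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
    risk = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    print(screened_fraction_at_sensitivity(labels, risk, sensitivity=0.75))
```

Under this reading, a lower value at a fixed sensitivity means fewer patients need screening to find the same share of cases, which is how the abstract compares the best ML model with the PCE calculator.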

Conclusions: Prediction models built using ML techniques improved ASCVD prediction and reduced the number of screenings required to predict ASCVD compared with the PCE calculator alone. Combining LT and CS features in the ML models significantly improved ASCVD prediction compared with using CS features alone.

Citing Articles

Machine learning based prediction models for cardiovascular disease risk using electronic health records data: systematic review and meta-analysis.

Liu T, Krentz A, Lu L, Curcin V. Eur Heart J Digit Health. 2025; 6(1):7-22.

PMID: 39846062 PMC: 11750195. DOI: 10.1093/ehjdh/ztae080.


Increasing provider awareness of Lp(a) testing for patients at risk for cardiovascular disease: A comparative study.

Eid W, Sapp E, Conroy C, Bessinger C, Moody C, Yadav R. Am J Prev Cardiol. 2024; 21:100895.

PMID: 39720768 PMC: 11666892. DOI: 10.1016/j.ajpc.2024.100895.


Detecting cardiovascular diseases using unsupervised machine learning clustering based on electronic medical records.

Hu Y, Yan H, Liu M, Gao J, Xie L, Zhang C. BMC Med Res Methodol. 2024; 24(1):309.

PMID: 39702064 PMC: 11658374. DOI: 10.1186/s12874-024-02422-z.


CardioRiskNet: A Hybrid AI-Based Model for Explainable Risk Prediction and Prognosis in Cardiovascular Disease.

Talaat F, Elnaggar A, Shaban W, Shehata M, Elhosseini M. Bioengineering (Basel). 2024; 11(8).

PMID: 39199780 PMC: 11351968. DOI: 10.3390/bioengineering11080822.


The Use of Deep Learning and Machine Learning on Longitudinal Electronic Health Records for the Early Detection and Prevention of Diseases: Scoping Review.

Swinckels L, Bennis F, Ziesemer K, Scheerman J, Bijwaard H, de Keijzer A. J Med Internet Res. 2024; 26:e48320.

PMID: 39163096 PMC: 11372333. DOI: 10.2196/48320.