Prediction Model for Cardiovascular Disease in Patients with Diabetes Using Machine Learning Derived and Validated in Two Independent Korean Cohorts
This study aimed to develop and validate a machine learning (ML) model tailored to the Korean population with type 2 diabetes mellitus (T2DM) to improve prediction of cardiovascular disease (CVD), a major chronic complication in these patients. We used data from two cohorts recruited between 2008 and 2022: a discovery cohort (one hospital; n = 12,809) and a validation cohort (two hospitals; n = 2,019). The outcome of interest was the presence or absence of CVD at 3 years. We trained various ML-based models with hyperparameter tuning in the discovery cohort and evaluated them by area under the receiver operating characteristic curve (AUROC) in the validation cohort. CVD was observed in 1,238 (10.2%) patients in the discovery cohort. The random forest (RF) model exhibited the best overall performance among the models, with an AUROC of 0.830 (95% confidence interval [CI] 0.818-0.842) in the discovery dataset and 0.722 (95% CI 0.660-0.783) in the validation dataset. Creatinine and glycated hemoglobin levels were the most influential factors in the RF model. This study introduces an ML-based model for predicting CVD in Korean patients with T2DM that outperformed existing prediction tools and may support early, personalized preventive medicine.
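As an illustration of how an AUROC with a 95% CI of the kind reported above can be computed, the following is a minimal sketch in plain Python. The abstract does not say how the intervals were derived, so the percentile bootstrap used here is an assumption, as are the function names `auroc` and `bootstrap_ci` and the synthetic labels and scores, which merely stand in for model outputs and are not the study data.

```python
import random


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney) identity; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative case")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC (an assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[k] for k in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auroc(ys, [scores[k] for k in idx]))
    stats.sort()
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    return stats[lo_idx], stats[hi_idx]


# Synthetic example: roughly 10% event rate, with scores shifted upward for cases.
rng = random.Random(42)
labels = [1 if rng.random() < 0.10 else 0 for _ in range(2000)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

point = auroc(labels, scores)
lo, hi = bootstrap_ci(labels, scores, n_boot=500)
print(f"AUROC {point:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

In practice the scores would be the predicted probabilities of a fitted classifier on the validation cohort. A wider interval in a smaller cohort, as with the validation figures above, is what resampling from fewer patients would be expected to produce.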