Prediction of Type 2 Diabetes Based on Machine Learning Algorithm

Overview
Publisher MDPI
Date 2021 Apr 3
PMID 33806973
Citations 47
Abstract

Predicting the occurrence of type 2 diabetes (T2D) allows a person at risk to act to prevent the onset or delay the progression of the disease. In this study, we developed a machine learning (ML) model to predict T2D occurrence in the following year (Y + 1) using variables from the current year (Y). The dataset was collected at a private medical institute as electronic health records from 2013 to 2018. To construct the prediction model, key features were first selected using ANOVA tests, chi-squared tests, and recursive feature elimination. The selected features were fasting plasma glucose (FPG), HbA1c, triglycerides, BMI, gamma-GTP, age, uric acid, sex, smoking, drinking, physical activity, and family history. We then applied logistic regression, random forest, support vector machine, XGBoost, and ensemble machine learning algorithms to these variables to predict the outcome as normal (non-diabetic), prediabetes, or diabetes. In the experiments, the prediction model performed reasonably well at forecasting the occurrence of T2D in the Korean population. The model can provide clinicians and patients with valuable predictive information on the likelihood of developing T2D. The cross-validation (CV) results showed that the ensemble models outperformed the single models. CV performance improved further when more medical history from the dataset was incorporated.
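The Y to Y + 1 framing in the abstract can be illustrated with a minimal sketch that pairs each patient's features in year Y with the outcome class observed in year Y + 1. Everything specific here is an assumption made for illustration: the record layout, the names `label_status` and `make_pairs`, and the use of the standard ADA cut-offs (FPG in mg/dL, HbA1c in %) to assign the three classes. The abstract does not state which labeling criteria the authors used.

```python
from typing import Dict, List, Tuple


def label_status(fpg: float, hba1c: float) -> str:
    """Assign one of three classes from FPG (mg/dL) and HbA1c (%).

    Assumption: ADA thresholds; the abstract does not give the criteria.
    """
    if fpg >= 126 or hba1c >= 6.5:
        return "diabetes"
    if fpg >= 100 or hba1c >= 5.7:
        return "prediabetes"
    return "normal"


def make_pairs(
    records: List[Dict[str, float]], features: List[str]
) -> List[Tuple[List[float], str]]:
    """Build (features at year Y, label at year Y + 1) training pairs.

    Hypothetical record layout: one dict per patient per year with
    'patient_id', 'year', 'fpg', 'hba1c', plus any other features.
    """
    by_key = {(r["patient_id"], r["year"]): r for r in records}
    pairs = []
    for (pid, year), current in sorted(by_key.items()):
        following = by_key.get((pid, year + 1))
        if following is None:
            # No visit in the following year, so no target is available.
            continue
        x = [current[name] for name in features]
        y = label_status(following["fpg"], following["hba1c"])
        pairs.append((x, y))
    return pairs


# Toy records, invented for illustration only.
records = [
    {"patient_id": 1, "year": 2013, "fpg": 95, "hba1c": 5.4, "bmi": 23.1},
    {"patient_id": 1, "year": 2014, "fpg": 104, "hba1c": 5.8, "bmi": 23.8},
    {"patient_id": 1, "year": 2015, "fpg": 128, "hba1c": 6.6, "bmi": 24.5},
    {"patient_id": 2, "year": 2013, "fpg": 90, "hba1c": 5.2, "bmi": 21.0},
    {"patient_id": 2, "year": 2015, "fpg": 92, "hba1c": 5.3, "bmi": 21.2},
]
pairs = make_pairs(records, ["fpg", "hba1c", "bmi"])
for x, y in pairs:
    print(x, "->", y)
```

Patient 2 contributes no pair because the 2014 visit is missing. The resulting pairs could then be passed to any of the classifiers named in the abstract.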

Citing Articles

Construction and Verification of a Frailty Risk Prediction Model for Elderly Patients with Coronary Heart Disease Based on a Machine Learning Algorithm.

Cao J, Zhang L, Zhou X Rev Cardiovasc Med. 2025; 26(2):26225.

PMID: 40026519 PMC: 11868882. DOI: 10.31083/RCM26225.


Development and validation of predictive models for diabetic retinopathy using machine learning.

Yang P, Yang B PLoS One. 2025; 20(2):e0318226.

PMID: 39992900 PMC: 11849896. DOI: 10.1371/journal.pone.0318226.


Prevalence of Musculoskeletal Disorders in Heavy Vehicle Drivers and Office Workers: A Comparative Analysis Using a Machine Learning Approach.

Raza M, Bhushan R, Khan A, Ali A, Khamaj A, Alam M Healthcare (Basel). 2025; 12(24.

PMID: 39765986 PMC: 11675938. DOI: 10.3390/healthcare12242560.


Constructing a predictive model for early-onset sepsis in neonatal intensive care unit newborns based on SHapley Additive exPlanations explainable machine learning.

Tan X, Zhang X, Chai J, Ji W, Ru J, Yang C Transl Pediatr. 2024; 13(11):1933-1946.

PMID: 39649648 PMC: 11621883. DOI: 10.21037/tp-24-278.


Stacking model framework reveals clinical biochemical data and dietary behavior features associated with type 2 diabetes: A retrospective cohort study.

Fu Y, Liang X, Yang X, Li L, Meng L, Wei Y APL Bioeng. 2024; 8(4):046111.

PMID: 39583336 PMC: 11584240. DOI: 10.1063/5.0207658.

