Comparative Study on Risk Prediction Model of Type 2 Diabetes Based on Machine Learning Theory: a Cross-sectional Study

Overview

Journal BMJ Open

Specialty General Medicine

Date 2023 Aug 29

PMID 37643856

Authors

Shu Wang

Rong Chen

Shuang Wang

Danli Kong

Rudai Cao

Chunwen Lin

Ling Luo

Jialu Huang

Qiaoli Zhang

Haibing Yu

Yuan Lin Ding


Abstract

Objectives: To compare the predictive performance of six machine learning models, providing a methodological reference for predicting the risk of type 2 diabetes mellitus (T2DM).

Setting and participants: This study was based on monitoring data on chronic disease risk factors among Dongguan residents from 2016 to 2018. Multistage cluster random sampling was used at each monitoring site, and 4157 people were selected. From this initial population, we excluded individuals with more than 20% missing data and included 4106 subjects in the analysis.

Design: The K-nearest neighbour algorithm and the synthetic minority oversampling technique (SMOTE) were used to process the data. Single-factor analysis was used for preliminary selection of variables. Ten-fold cross-validation was used to optimise the parameters of some models. Accuracy, precision, recall and area under the receiver operating characteristic curve (AUC) were used to evaluate the predictive performance of the models, and the DeLong test was used to analyse differences between the AUC values of the models.
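The four evaluation indicators named in the design can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half).

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = T2DM, 0 = control)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy_precision_recall(y_true, y_pred):
    """Return (accuracy, precision, recall); undefined ratios fall back to 0.0."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not from the study).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

acc, prec, rec = accuracy_precision_recall(y_true, y_pred)
area = auc(y_true, scores)
print(acc, prec, rec, area)
```

Accuracy, precision and recall depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is presumably why the study reports both kinds of indicator.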

Results: After balancing, the sample size increased to 8013, comprising 4023 patients with T2DM and 3990 controls. Comparison of the six models showed that the back propagation neural network model had the best predictive performance, with 93.7% accuracy, 94.6% precision, 92.8% recall and an AUC of 0.977, followed by the logistic model, support vector machine model, CART decision tree model and C4.5 decision tree model. The deep neural network had the worst predictive performance, with 84.5% accuracy, 86.1% precision, 82.9% recall and an AUC of 0.845.
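The balancing step behind the enlarged sample relies on the synthetic minority oversampling technique. The core idea can be sketched as follows, assuming purely numeric features and Euclidean distance. The function name `smote_sketch`, its parameters and the toy points are hypothetical, and the study's actual implementation and settings are not stated in the abstract. Each synthetic point is placed at a random position on the line segment between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation.

    minority: list of numeric feature vectors (needs at least two samples).
    k: number of nearest minority neighbours considered for each base sample.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features (hypothetical values).
minority_points = [[1.0, 2.0], [2.0, 3.0], [3.0, 1.0], [2.5, 2.5]]
new_points = smote_sketch(minority_points, n_new=4, k=2)
print(new_points)
```

Because every synthetic point is an interpolation between two real minority samples, it stays inside the region already occupied by that class. This also means that metrics measured on balanced data may differ from those obtained on the original class distribution.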

Conclusions: In this study, six risk prediction models for T2DM were constructed, and their predictive performance was compared across several indicators. The back propagation neural network had the best predictive performance on the selected data set.

Citing Articles

Predicting three-month fasting blood glucose and glycated hemoglobin changes in patients with type 2 diabetes mellitus based on multiple machine learning algorithms.

Tao X, Jiang M, Liu Y, Hu Q, Zhu B, Hu J. Sci Rep. 2023; 13(1):16437.

PMID: 37777593 PMC: 10543442. DOI: 10.1038/s41598-023-43240-5.
