
Predicting the Risk of Pulmonary Embolism in Patients with Tuberculosis Using Machine Learning Algorithms

Overview
Journal Eur J Med Res
Publisher Biomed Central
Specialty General Medicine
Date 2024 Dec 22
PMID 39710777
Abstract

Background: This study aimed to develop predictive models with robust generalization capabilities for assessing the risk of pulmonary embolism in patients with tuberculosis using machine learning algorithms.

Methods: Data were collected from two centers and divided into development and validation cohorts. Using the development cohort, candidate variables were selected via Recursive Feature Elimination (RFE). Five machine learning algorithms were used to construct the predictive models: logistic regression (LR), random forest (RF), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and support vector machine (SVM). Model performance was evaluated using nested cross-validation and the area under the curve (AUC), supplemented by interpretation with SHapley Additive exPlanations (SHAP) and line charts of AUC values. The models were then externally validated in the independent validation cohort, with the aim of supporting early identification and management of pulmonary embolism risk in tuberculosis patients.
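The selection and evaluation steps described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic (`make_classification`), the number of retained features, the hyperparameter grid, and the fold counts are all hypothetical, and only the RF model is shown. Placing RFE inside the pipeline is a design choice made here so that features are re-selected within each training fold; the abstract does not say whether the study did this.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in for a development cohort (NOT the study data).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=6, random_state=0
)

# RFE ranks features with a logistic regression and keeps a fixed number
# (8 is an arbitrary choice for this sketch); RF is the final classifier.
pipeline = Pipeline([
    ("rfe", RFE(LogisticRegression(max_iter=1000), n_features_to_select=8)),
    ("clf", RandomForestClassifier(n_estimators=50, random_state=0)),
])

# Inner loop tunes hyperparameters; outer loop estimates performance.
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

search = GridSearchCV(
    pipeline,
    param_grid={"clf__max_depth": [3, None]},  # hypothetical grid
    scoring="roc_auc",
    cv=inner_cv,
)

# Each outer fold refits the whole search on its training part only,
# so the held-out fold never influences selection or tuning.
outer_auc = cross_val_score(search, X, y, scoring="roc_auc", cv=outer_cv)
print(f"nested CV AUC: {outer_auc.mean():.3f} +/- {outer_auc.std():.3f}")
```

Swapping the `clf` step for another estimator (for example an SVM with `probability=True`) would give a like-for-like comparison across algorithms under the same folds.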

Results: Data from 694 patients were used for model development, and 236 patients in the validation cohort met the enrollment criteria. The optimal subset of variables comprised D-dimer, smoking status, dyspnea, age, sex, diabetes, platelet count, cough, fibrinogen, hemoglobin, hemoptysis, hypertension, chronic obstructive pulmonary disease (COPD), and chest pain. The RF model outperformed the other models, achieving an AUC of 0.839 (95% CI 0.780-0.899) and maintaining the highest average performance in external fivefold cross-validation (AUC: 0.906 ± 0.041).
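An AUC reported with a 95% confidence interval, as above, can be obtained in several ways; the abstract does not state which method was used. The sketch below assumes a percentile bootstrap, one common option, and runs on simulated labels and scores rather than the study data. The function names `auc` and `bootstrap_auc_ci` are hypothetical helpers written for this illustration.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as half (Mann-Whitney formulation)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Point estimate plus percentile-bootstrap interval for the AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return auc(labels, scores), lower, upper

# Simulated validation set (NOT the study data): about 30% positives,
# with positives scoring higher on average.
rng = random.Random(42)
labels = [1 if rng.random() < 0.3 else 0 for _ in range(120)]
scores = [rng.gauss(0.65 if l else 0.40, 0.15) for l in labels]

point, lower, upper = bootstrap_auc_ci(labels, scores)
print(f"AUC {point:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```

Analytic alternatives such as the DeLong method exist; the bootstrap is shown here only because it needs no extra dependencies.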

Conclusions: The RF model showed consistently high performance in predicting pulmonary embolism risk in patients with tuberculosis.
