
Efficient Management of Pulmonary Embolism Diagnosis Using a Two-step Interconnected Machine Learning Model Based on Electronic Health Records Data

Overview
Publisher Springer
Date 2024 Mar 11
PMID 38464464
Abstract

Pulmonary embolism (PE) is a life-threatening condition with no specific clinical symptoms, and computed tomography angiography (CTA) is used for its diagnosis. Clinical decision support scoring systems based on PE risk factors, such as Wells and rGeneva, have been developed to estimate the pre-test probability, but they are underused, which leads to continued overuse of CTA imaging. This diagnostic study proposes a novel approach for efficient management of PE diagnosis: a two-step interconnected machine learning framework that directly analyzes patients' Electronic Health Records data. First, feature importance analysis was performed with LightGBM, chosen for its superior PE prediction performance. Then, four state-of-the-art machine learning methods were applied to predict PE based on the feature importance results, enabling swift and accurate pre-test diagnosis. Patient data were collected from different departments of Sina Educational Hospital, affiliated with Tehran University of Medical Sciences in Iran. Overall, the Ridge classification method achieved the best performance, with an F1 score of 0.96. Extensive experiments showed the effectiveness and simplicity of this diagnostic process compared with the existing scoring systems. The main strength of the approach lies in PE disease management: it could reduce avoidable invasive CTA imaging and serve as a primary prognosis of PE, thereby assisting the healthcare system, clinicians, and patients by reducing costs and improving treatment quality and patient satisfaction.
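The two-step idea described in the abstract (rank features with a gradient-boosted tree model, then train a classifier on the top-ranked features and score it with F1) can be illustrated with a minimal sketch. This is not the authors' code. It makes several assumptions: the data are synthetic, generated by `make_classification`, and are not patient records; scikit-learn's `GradientBoostingClassifier` stands in for LightGBM so that the example needs only one library; and the cut-off `top_k = 8` and the Ridge penalty `alpha=1.0` are hypothetical choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import RidgeClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for tabular EHR features (assumption: NOT the study's data).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 1: rank features by importance with a gradient-boosted tree model.
# (Stand-in for LightGBM; the study's actual model settings are not given here.)
ranker = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
top_k = 8  # hypothetical cut-off, chosen only for illustration
selected = np.argsort(ranker.feature_importances_)[::-1][:top_k]

# Step 2: train a Ridge classifier on the selected features only.
clf = RidgeClassifier(alpha=1.0).fit(X_train[:, selected], y_train)
y_pred = clf.predict(X_test[:, selected])

# Evaluate on the held-out split with the F1 score (harmonic mean of
# precision and recall for the positive class).
f1 = f1_score(y_test, y_pred)
print("selected feature indices:", selected.tolist())
print("held-out F1:", round(f1, 3))
```

The score printed here reflects only the synthetic data and says nothing about the reported F1 of 0.96. Selecting features on the training split alone, as above, keeps the held-out evaluation free of information leaked from the test set.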

Citing Articles

Multiple feature selection based on an optimization strategy for causal analysis of health data.

Cong R, Deng O, Nishimura S, Ogihara A, Jin Q. Health Inf Sci Syst. 2024; 12(1):52.

PMID: 39534650 PMC: 11554952. DOI: 10.1007/s13755-024-00312-8.


Genetic factors, risk prediction and AI application of thrombotic diseases.

Wang R, Tang L, Hu Y. Exp Hematol Oncol. 2024; 13(1):89.

PMID: 39192370 PMC: 11348605. DOI: 10.1186/s40164-024-00555-x.
