
Benign-malignant Classification of Pulmonary Nodules by Low-dose Spiral Computerized Tomography and Clinical Data with Machine Learning in Opportunistic Screening

Overview
Journal Cancer Med
Specialty Oncology
Date 2023 May 30
PMID 37248730
Abstract

Background: Pulmonary nodules are frequently found during routine physical examinations. Discriminating benign from malignant nodules with data mining techniques is of great practical significance.

Methods: The subjects' demographic data, baseline examination results, and annual follow-up low-dose spiral computerized tomography (LDCT) results were recorded. The findings from annual physical examinations for positive nodules, including highly suspicious nodules and clinically tentative benign nodules, were analyzed. An extreme gradient boosting (XGBoost) model was constructed, and grid search with cross-validation (GridSearchCV) was used to select the hyperparameters. Data from an external unit were used as an external validation set to evaluate the generalization performance of the model.
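The hyperparameter selection step can be illustrated generically. The following is a minimal pure-Python sketch of grid search with k-fold cross-validation, not the authors' pipeline: a toy k-nearest-neighbour classifier stands in for XGBoost, the data are synthetic, and the names `k_fold_indices`, `knn_predict`, and `grid_search_cv` are hypothetical. The abstract does not report the parameter grid or the number of folds, so the values used here are assumptions.

```python
import itertools
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, n_neighbors):
    """Majority vote among the closest training points (toy stand-in model)."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = [train_y[i] for i in order[:n_neighbors]]
    return 1 if 2 * sum(votes) > len(votes) else 0


def grid_search_cv(X, y, param_grid, k=5):
    """Return (best_params, best_mean_accuracy) over every grid combination."""
    folds = k_fold_indices(len(X), k)
    names = sorted(param_grid)
    best_params, best_score = None, -1.0
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        scores = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(len(X)) if i not in held]
            tr_X = [X[i] for i in train]
            tr_y = [y[i] for i in train]
            # Accuracy on the held-out fold for this parameter combination.
            correct = sum(
                knn_predict(tr_X, tr_y, X[i], **params) == y[i] for i in held_out
            )
            scores.append(correct / len(held_out))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Synthetic two-feature data (NOT study data): class 1 is shifted by 2 units.
rng = random.Random(42)
X = [[rng.gauss(2.0 * c, 1.0), rng.gauss(2.0 * c, 1.0)]
     for c in (0, 1) for _ in range(40)]
y = [c for c in (0, 1) for _ in range(40)]

best_params, best_score = grid_search_cv(X, y, {"n_neighbors": [1, 3, 5, 7]}, k=5)
print(best_params, round(best_score, 3))
```

In practice this loop is usually delegated to scikit-learn's `GridSearchCV` wrapped around `xgboost.XGBClassifier`; the hand-written version only makes the fold-by-fold scoring explicit.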

Results: A total of 135,503 physical examinees were enrolled. At baseline, 27,636 (20.40%) participants had clinically tentative benign nodules and 611 (0.45%) had highly suspicious nodules. Among participants with a negative baseline, the proportion with highly suspicious nodules during follow-up was about 0.12%-0.46%, which was lower than the baseline level except at follow-up of >5 years. Among the 27,636 participants with clinically tentative benign nodules, the proportion of highly suspicious nodules was significantly greater than at baseline screening (0.45%) only in the first year of LDCT re-examination (1.40%; p < 0.001); it did not differ from baseline in the other follow-up years (p > 0.05). Furthermore, 322 patients with benign nodules and 196 patients with malignant nodules, confirmed by surgery and pathology, were compared. A model and the 15 most important clinical variables were determined by the XGBoost algorithm. The area under the curve (AUC) of the model was 0.76 [95% CI: 0.67-0.84], and the accuracy was 0.75. The sensitivity and specificity of the model at the chosen threshold were 0.78 and 0.73, respectively. In external validation, the AUC was 0.87 and the accuracy was 0.80; the sensitivity and specificity were 0.83 and 0.77, respectively.
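For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity, accuracy, and AUC are conventionally computed from true labels and predicted scores. It is an illustration only: the labels and scores are made up, the 0.5 threshold is an assumption (the abstract does not state the threshold used), and the function names are hypothetical. The last two lines simply recompute the baseline percentages from the counts given above.

```python
def confusion_counts(y_true, y_score, threshold=0.5):
    """Count TP, FP, TN, FN when scores >= threshold are called malignant (1)."""
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, y_score):
        pred = 1 if score >= threshold else 0
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity_accuracy(tp, fp, tn, fn):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP); accuracy = correct/all."""
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / (tp + fp + tn + fn)


def auc(y_true, y_score):
    """Probability that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and scores for illustration (NOT study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

sens, spec, acc = sensitivity_specificity_accuracy(*confusion_counts(y_true, y_score))
print(sens, spec, acc)        # 0.75 0.75 0.75
print(auc(y_true, y_score))   # 0.8125

# Baseline percentages recomputed from the counts reported in the abstract.
print(round(100 * 27636 / 135503, 2))  # 20.4
print(round(100 * 611 / 135503, 2))    # 0.45
```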

Conclusions: It is important to identify pulmonary nodules more accurately at the first LDCT examination. A model with 15 variables that are routinely measured in the clinic could help distinguish benign from malignant nodules. It could help the radiological team issue a more accurate report, and it may guide the clinical team regarding LDCT follow-up.

Citing Articles

Differentiating Pulmonary Nodule Malignancy Using Exhaled Volatile Organic Compounds: A Prospective Observational Study.

Lu G, Su Z, Yu X, He Y, Sha T, Yan K. Cancer Med. 2025; 14(1):e70545.

PMID: 39777868 PMC: 11706237. DOI: 10.1002/cam4.70545.


Non-small cell lung cancer in ever-smokers vs never-smokers.

Burt J, Qaqish N, Stoddard G, Jridi A, Anderson P, Woods L. BMC Med. 2025; 23(1):3.

PMID: 39757150 PMC: 11702147. DOI: 10.1186/s12916-024-03844-8.


Benign-malignant classification of pulmonary nodules by low-dose spiral computerized tomography and clinical data with machine learning in opportunistic screening.

Zheng Y, Dong J, Yang X, Shuai P, Li Y, Li H. Cancer Med. 2023; 12(11):12050-12064.

PMID: 37248730 PMC: 10278478. DOI: 10.1002/cam4.5886.
