Uncertainty-aware Automatic TNM Staging Classification for [18F] Fluorodeoxyglucose PET-CT Reports for Lung Cancer Utilising Transformer-based Language Models and Multi-task Learning

Overview

Journal BMC Med Inform Decis Mak

Publisher BioMed Central

Specialty Medical Informatics

Date 2024 Dec 19

PMID 39695672

Authors

Stephen H Barlow

Sugama Chicklore

Yulan He

Sebastien Ourselin

Thomas Wagner

Anna Barnes

Gary J R Cook


Abstract

Background: [18F] Fluorodeoxyglucose (FDG) PET-CT is a clinical imaging modality widely used in diagnosing and staging lung cancer. The clinical findings of PET-CT studies are contained within free-text reports, which can currently only be categorised by experts reading them manually. Pre-trained transformer-based language models (PLMs) have shown success in extracting complex linguistic features from text. Accordingly, we developed a multi-task 'TNMu' classifier to classify the presence/absence of tumour, node, metastasis ('TNM') findings (as defined by the Eighth Edition of TNM Staging for Lung Cancer). This is combined with an uncertainty classification task ('u') to account for studies with ambiguous TNM status.

Methods: 2498 reports were annotated by a nuclear medicine physician and split into train, validation, and test datasets. For additional evaluation, an external dataset (n = 461 reports) was created and annotated by two nuclear medicine physicians, with agreement reached on all examples. We trained and evaluated eleven publicly available PLMs to determine which is most effective for PET-CT reports, and compared multi-task, single-task and traditional machine learning approaches.

Results: We found that a multi-task approach with GatorTron as the PLM achieved the best performance, with an overall accuracy (all four tasks correct) of 84% and a Hamming loss of 0.05 on the internal test dataset, and 79% and 0.07 on the external test dataset. Performance on the individual T, N and M tasks approached expert performance, with macro-average F1 scores of 0.91, 0.95 and 0.90, respectively, on external data. For the uncertainty task, an F1 of 0.77 was achieved.
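The three metrics reported above can be illustrated with a minimal sketch. It assumes each report is encoded as four binary labels in the order (T, N, M, u); that encoding, the function names, and the toy labels below are illustrative assumptions and are not taken from the study.

```python
from typing import Sequence, Tuple

Labels = Tuple[int, int, int, int]  # assumed order: (T, N, M, u), 1 = present

def exact_match_accuracy(y_true: Sequence[Labels], y_pred: Sequence[Labels]) -> float:
    """Overall accuracy: fraction of reports with all four tasks correct."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if tuple(t) == tuple(p))
    return hits / len(y_true)

def hamming_loss(y_true: Sequence[Labels], y_pred: Sequence[Labels]) -> float:
    """Fraction of individual task labels that are wrong, over all reports."""
    wrong = sum(ti != pi for t, p in zip(y_true, y_pred) for ti, pi in zip(t, p))
    total = sum(len(t) for t in y_true)
    return wrong / total

def macro_f1(y_true: Sequence[Labels], y_pred: Sequence[Labels], task: int) -> float:
    """Unweighted mean of the F1 scores of classes 0 and 1 for one task."""
    scores = []
    for cls in (0, 1):
        tp = sum(t[task] == cls and p[task] == cls for t, p in zip(y_true, y_pred))
        fp = sum(t[task] != cls and p[task] == cls for t, p in zip(y_true, y_pred))
        fn = sum(t[task] == cls and p[task] != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # Convention: F1 is 0 when a class never occurs in truth or prediction.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy labels, invented for illustration only.
y_true = [(1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0), (0, 0, 0, 1)]
y_pred = [(1, 0, 0, 0), (1, 1, 0, 0), (1, 0, 1, 0), (0, 0, 0, 0)]

print(exact_match_accuracy(y_true, y_pred))  # 2 of 4 reports fully correct
print(hamming_loss(y_true, y_pred))          # 2 of 16 labels wrong
print(macro_f1(y_true, y_pred, task=1))      # macro F1 for the N task
```

Overall accuracy is the stricter measure, since one wrong task makes the whole report count as an error, whereas Hamming loss gives partial credit per task. This is why a high per-task F1 can coexist with a lower overall accuracy.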

Conclusions: Our 'TNMu' classifier successfully extracts TNM staging information from internal and external PET-CT reports. We conclude that multi-task approaches give the best performance and better computational efficiency than single-task PLM approaches. We believe these models can improve PET-CT services by assisting in auditing, creating research cohorts, and developing decision support systems. Our approach to handling uncertainty represents a novel first step but has room for further refinement.
