Expert-centered Evaluation of Deep Learning Algorithms for Brain Tumor Segmentation

Overview

Journal Radiol Artif Intell

Specialties Biomedical Engineering, Radiology

Date 2024 Jan 10

PMID 38197800

Authors

Katharina V Hoebel

Christopher P Bridge

Sara Ahmed

Oluwatosin Akintola

Caroline Chung

Raymond Y Huang

Jason M Johnson

Albert Kim

K Ina Ly

Ken Chang

Jay Patel

Marco Pinho

Tracy T Batchelor

Bruce R Rosen

Elizabeth R Gerstner

Jayashree Kalpathy-Cramer

Abstract

Purpose: To present results from a literature survey on evaluation practices for deep learning segmentation algorithms and to study how experts perceive the quality of brain tumor segmentations.

Materials and Methods: A total of 180 articles reporting on brain tumor segmentation algorithms were surveyed for the quality evaluation they reported. Additionally, ratings of segmentation quality on a four-point scale were collected from medical professionals for 60 brain tumor segmentation cases.

Results: Among the surveyed articles, Dice score, sensitivity, and Hausdorff distance were the metrics most commonly used to report segmentation performance. Notably, only 2.8% of the articles included clinical experts' evaluation of segmentation quality. The experimental results revealed low interrater agreement (Krippendorff α, 0.34) in experts' perception of segmentation quality. Furthermore, the correlations between the ratings and commonly used quantitative quality metrics were low (Kendall tau between Dice score and mean rating, 0.23; Kendall tau between Hausdorff distance and mean rating, 0.51), with large variability among the experts.

Conclusion: The results demonstrate that quality ratings are prone to variability due to the ambiguity of tumor boundaries and individual perceptual differences, and that existing metrics do not capture the clinical perception of segmentation quality.

Keywords: Brain Tumor Segmentation, Deep Learning Algorithms, Glioblastoma, Cancer, Machine Learning

Clinical trial registration nos. NCT00756106 and NCT00662506

© RSNA, 2023.
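The quantities named in the abstract (Dice score, Hausdorff distance, and the Kendall tau correlation between a metric and the mean expert rating) can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not describe. The function names, the representation of masks as sets of voxel coordinates, and all toy values are assumptions made for illustration only; none of the numbers come from the study.

```python
from itertools import combinations
from math import dist, sqrt


def dice_score(pred, ref):
    """Dice overlap between two masks given as sets of voxel coordinates."""
    if not pred and not ref:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2 * len(pred & ref) / (len(pred) + len(ref))


def hausdorff_distance(pred, ref):
    """Symmetric Hausdorff distance between two non-empty voxel sets."""
    def directed(a, b):
        # Largest distance from a point in a to its nearest point in b.
        return max(min(dist(p, q) for q in b) for p in a)
    return max(directed(pred, ref), directed(ref, pred))


def kendall_tau_b(x, y):
    """Kendall tau-b rank correlation, with the standard correction for ties."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1
        if dy == 0:
            ties_y += 1
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    denom = sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical 2D toy masks: a reference square and a prediction shifted by one voxel.
ref_mask = {(0, 0), (0, 1), (1, 0), (1, 1)}
pred_mask = {(0, 1), (1, 1), (0, 2), (1, 2)}
dice = dice_score(pred_mask, ref_mask)
hausdorff = hausdorff_distance(pred_mask, ref_mask)

# Hypothetical per-case Dice scores and mean expert ratings (four-point scale).
toy_dice = [0.9, 0.8, 0.7, 0.6]
toy_mean_rating = [4.0, 3.0, 3.5, 2.0]
tau = kendall_tau_b(toy_dice, toy_mean_rating)

print(f"Dice={dice:.2f}, Hausdorff={hausdorff:.2f}, Kendall tau={tau:.2f}")
```

Kendall tau is a rank correlation, so it asks only whether cases with a higher metric value also tend to receive a higher rating. It suits ordinal ratings such as a four-point scale, where the spacing between scale points is not assumed to be uniform.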

Citing Articles

A review of deep learning for brain tumor analysis in MRI.

Dorfner F, Patel J, Kalpathy-Cramer J, Gerstner E, Bridge C. NPJ Precis Oncol. 2025; 9(1):2.

PMID: 39753730 PMC: 11698745. DOI: 10.1038/s41698-024-00789-2.

Automated brain segmentation and volumetry in dementia diagnostics: a narrative review with emphasis on FreeSurfer.

Khadhraoui E, Nickl-Jockschat T, Henkes H, Behme D, Muller S. Front Aging Neurosci. 2024; 16:1459652.

PMID: 39291276 PMC: 11405240. DOI: 10.3389/fnagi.2024.1459652.
