
Learning from Multiple Annotators for Medical Image Segmentation

Overview
Publisher Elsevier
Date 2023 Oct 2
PMID 37781685
Abstract

Supervised machine learning methods have been widely developed for segmentation tasks in recent years. However, the quality of labels has a high impact on the predictive performance of these algorithms. This issue is particularly acute in the medical image domain, where both the cost of annotation and the inter-observer variability are high. Different human experts contribute estimates of the "actual" segmentation labels in a typical label acquisition process, influenced by their personal biases and competency levels. The performance of automatic segmentation algorithms is limited when these noisy labels are used as the expert consensus label. In this work, we use two coupled CNNs to jointly learn, from noisy observations alone, the reliability of individual annotators and the expert consensus label distributions. The separation of the two is achieved by maximally describing the annotator's "unreliable behavior" (we call it "maximally unreliable") while achieving high fidelity with the noisy training data. We first create a toy segmentation dataset using MNIST and investigate the properties of the proposed algorithm. We then use three public medical imaging segmentation datasets to demonstrate our method's efficacy, including both simulated (where necessary) and real-world annotations: 1) ISBI2015 (multiple-sclerosis lesions); 2) BraTS (brain tumors); 3) LIDC-IDRI (lung abnormalities). Finally, we create a real-world multiple sclerosis lesion dataset (QSMSC at UCL: Queen Square Multiple Sclerosis Center at UCL, UK) with manual segmentations from 4 different annotators (3 radiologists with different skill levels and 1 expert to generate the expert consensus label). In all datasets, our method consistently outperforms competing methods and relevant baselines, especially when the number of annotations is small and the amount of disagreement is large. The studies also reveal that the system is capable of capturing the complicated spatial characteristics of annotators' mistakes.
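
The coupled objective described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the loss, so the trace penalty used here to encourage "maximally unreliable" annotator models, the array shapes, the function name `noisy_label_loss`, and the `trace_weight` parameter are all assumptions. In a real system the consensus probabilities and the per-pixel confusion matrices would be produced by the two CNNs; here they are plain NumPy arrays.

```python
import numpy as np

def noisy_label_loss(probs, cms, noisy_labels, trace_weight=0.01):
    """Hypothetical loss: fidelity to noisy labels plus a trace penalty.

    probs:        (H, W, C) estimated consensus class distribution per pixel.
    cms:          (R, H, W, C, C) per-annotator, per-pixel confusion matrices,
                  where cms[r, h, w, i, j] is the assumed probability that
                  annotator r reports class i when the consensus class is j.
    noisy_labels: (R, H, W) integer class labels from R annotators.
    """
    # Predicted noisy-label distribution for each annotator: CM @ p.
    noisy_probs = np.einsum("rhwij,hwj->rhwi", cms, probs)

    # Probability assigned to the label each annotator actually gave.
    picked = np.take_along_axis(
        noisy_probs, noisy_labels[..., None], axis=-1
    )[..., 0]

    # Fidelity term: cross-entropy against the observed noisy labels.
    cross_entropy = -np.mean(np.log(picked + 1e-12))

    # Assumed regulariser: a small trace pushes the confusion matrices
    # away from the identity, i.e. towards "unreliable" annotators.
    mean_trace = np.mean(np.einsum("rhwii->rhw", cms))

    return cross_entropy + trace_weight * mean_trace
```

Minimising the first term alone would let the consensus network absorb annotator errors. The penalty term is one plausible way to force the annotator model to explain as much of the disagreement as it can while still fitting the observed labels.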

Citing Articles

Stacking Model-Based Classifiers for Dealing With Multiple Sets of Noisy Labels.

Montani G, Cappozzo A Biom J. 2025; 67(2):e70042.

PMID: 40071867 PMC: 11898607. DOI: 10.1002/bimj.70042.


Evaluation of artificial intelligence-based autosegmentation for a high-performance cone-beam computed tomography imaging system in the pelvic region.

Sluijter J, van de Schoot A, Yaakoubi A, de Jong M, van der Knaap-van Dongen M, Kunnen B Phys Imaging Radiat Oncol. 2025; 33():100687.

PMID: 39802649 PMC: 11721864. DOI: 10.1016/j.phro.2024.100687.


RapidBrachyIVBT: A dosimetry software for patient-specific intravascular brachytherapy dose calculations on optical coherence tomography images.

Rahbaran M, Kalinowski J, DeCunha J, Croce K, Bergmark B, Tsui J Med Phys. 2024; 52(2):1256-1267.

PMID: 39561213 PMC: 11788245. DOI: 10.1002/mp.17525.


Advancing image segmentation with DBO-Otsu: Addressing rubber tree diseases through enhanced threshold techniques.

Xie Z, Wu J, Tang W, Liu Y PLoS One. 2024; 19(3):e0297284.

PMID: 38512907 PMC: 10956860. DOI: 10.1371/journal.pone.0297284.
