Toward Understanding Deep Learning Classification of Anatomic Sites: Lessons from the Development of a CBCT Projection Classifier
Overview
Deep learning (DL) applications depend strongly on the training dataset and the convolutional neural network (CNN) architecture; however, it is unclear how to select these parameters objectively. We investigate the classification performance of different DL models and training schemes for the anatomic classification of cone-beam computed tomography (CBCT) projections. CBCT scans from 1055 patients were collected, manually classified into five anatomic classes, and used to develop DL models that predict the anatomic class from single x-ray projections. VGG-16, Xception, and Inception v3 architectures were trained with 75% of the data, and the remaining 25% was reserved for testing and evaluation. To study how classification performance depends on dataset size, the training data were downsampled to various sizes. Gradient-weighted class activation maps (Grad-CAM) were generated with the best-performing model to identify regions with a strong influence on CNN decisions. The highest precision and recall values were achieved with VGG-16; one of the best-performing combinations was VGG-16 trained with 90 deg projections (mean class precision = 0.87). The training dataset could be reduced to a fraction of its initial size without compromising classification performance. For correctly classified cases, Grad-CAM activations were weighted more heavily toward anatomically relevant regions. It was thus possible to determine which factors most strongly influence the classification performance of DL models for the studied task, and Grad-CAM enabled the identification of possible sources of class confusion.
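For readers who want a concrete picture of the workflow described above, the sketch below shows one way a projection classifier and its Grad-CAM analysis could be assembled in TensorFlow/Keras. It is a minimal, illustrative sketch under stated assumptions, not the authors' implementation: the 224 x 224 input size, the three-channel input, the optimizer, the block5_conv3 layer name, and the commented-out data-loading and fitting steps are assumptions made for the example.

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

NUM_CLASSES = 5  # five anatomic classes, as in the study


def build_vgg16_classifier(input_shape=(224, 224, 3)):
    """Assemble a VGG-16 backbone with a small softmax classification head."""
    base = tf.keras.applications.VGG16(include_top=False, weights=None,
                                       input_shape=input_shape)
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    out = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def grad_cam(model, image, conv_layer="block5_conv3"):
    """Grad-CAM heat map for a single preprocessed projection of shape (H, W, C)."""
    grad_model = tf.keras.Model(
        model.inputs, [model.get_layer(conv_layer).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[None, ...])
        class_score = preds[:, tf.argmax(preds[0])]   # score of the predicted class
    grads = tape.gradient(class_score, conv_out)
    weights = tf.reduce_mean(grads, axis=(0, 1, 2))   # global-average-pooled gradients
    cam = tf.nn.relu(tf.reduce_sum(conv_out[0] * weights, axis=-1))
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()  # normalized heat map


# --- hypothetical usage (data loading is project specific and not shown) ---
# X: (N, 224, 224, 3) projection images, y: (N,) integer anatomic-class labels
# X_train, X_test, y_train, y_test = train_test_split(
#     X, y, test_size=0.25, stratify=y, random_state=0)   # 75/25 split, as in the study
# model = build_vgg16_classifier()
# model.fit(X_train, y_train, validation_split=0.1, epochs=30, batch_size=32)
# # dataset-size dependence: refit on random subsets of X_train of decreasing size
# heatmap = grad_cam(model, X_test[0])
```

The other backbones mentioned above (Xception, Inception v3) could be swapped in through the corresponding tf.keras.applications constructors, with the conv_layer argument pointed at that network's final convolutional layer.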