HCTNet: A Hybrid ConvNet-Transformer Network for Retinal Optical Coherence Tomography Image Classification
Automatic and accurate classification of optical coherence tomography (OCT) images is of great significance to computer-assisted diagnosis of retinal disease. In this study, we propose a hybrid ConvNet-Transformer network (HCTNet) and verify the feasibility of a Transformer-based method for retinal OCT image classification. HCTNet first uses a low-level feature extraction module based on the residual dense block to generate low-level features that facilitate network training. Two parallel branches, a Transformer and a ConvNet, are then designed to exploit the global and local context of the OCT images, respectively. Finally, a feature fusion module based on an adaptive re-weighting mechanism combines the extracted global and local features to predict the category of the OCT images in the test datasets. HCTNet thus combines the strength of the convolutional neural network in extracting local features with the strength of the vision Transformer (ViT) in establishing long-range dependencies. Evaluated on two public retinal OCT datasets, HCTNet achieves overall accuracies of 91.56% and 86.18%, respectively, outperforming the pure ViT and several ConvNet-based classification methods.
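The abstract does not give the formula behind the adaptive re-weighting fusion, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each branch yields a feature vector of the same length, that each branch is scored by its mean activation, and that the two scores are normalized with a softmax to obtain the fusion weights. The function names `softmax` and `adaptive_fuse` and the scoring rule are hypothetical; in a trained network the weights would typically come from learned layers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(global_feat, local_feat):
    """Fuse two equal-length feature vectors with data-dependent weights.

    global_feat: features from the (assumed) Transformer branch.
    local_feat:  features from the (assumed) ConvNet branch.
    Returns the fused vector and the pair of branch weights.
    """
    if len(global_feat) != len(local_feat):
        raise ValueError("branch features must have the same length")
    if not global_feat:
        raise ValueError("branch features must be non-empty")

    # Hypothetical scoring rule: mean activation of each branch.
    scores = [
        sum(global_feat) / len(global_feat),
        sum(local_feat) / len(local_feat),
    ]
    w_global, w_local = softmax(scores)

    # Weighted element-wise sum of the two branches.
    fused = [w_global * g + w_local * l for g, l in zip(global_feat, local_feat)]
    return fused, (w_global, w_local)


# Toy inputs standing in for global and local feature vectors.
fused, weights = adaptive_fuse([0.2, 0.8, 0.5], [0.9, 0.4, 0.5])
```

Because the weights are computed from the inputs themselves, the balance between global and local features can change from image to image, which is the property the term "adaptive" refers to in this sketch.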