TMNet: A Two-Branch Multi-Scale Semantic Segmentation Network for Remote Sensing Images

Overview

Journal Sensors (Basel)

Publisher MDPI

Specialty Biotechnology

Date 2023 Jul 14

PMID 37447759

Authors

Yupeng Gao

Shengwei Zhang

Dongshi Zuo

Weihong Yan

Xin Pan

Abstract

Pixel-level information in remote sensing images is valuable in many fields. Convolutional neural networks (CNNs) are effective at extracting image features, but because convolution is a local operation, they struggle to capture global feature information and contextual semantic interactions directly. This limits the accuracy that a pure CNN model can reach in semantic segmentation of remote sensing images. Inspired by the Swin Transformer and its global feature encoding capability, we design a two-branch multi-scale semantic segmentation network (TMNet) for remote sensing images. The network uses a dual-encoder, single-decoder structure. The Swin Transformer is used to strengthen the extraction of global feature information. A multi-scale feature fusion module (MFM) merges shallow spatial features from images of different scales into deep features. In addition, a feature enhancement module (FEM) and a channel enhancement module (CEM) are proposed and added to the dual encoder to enhance feature extraction. Experiments on the WHDLD and Potsdam datasets verify the performance of TMNet.
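
The abstract's point about the locality of convolution can be made concrete with the standard receptive-field calculation for a stack of convolution layers. The sketch below is an illustration only and is not taken from the TMNet paper: the function name `receptive_field` and the two layer stacks are hypothetical, chosen to show how slowly the region seen by one output position grows with depth.

```python
def receptive_field(layers):
    """Receptive field, in input pixels along one axis, of a stack of convolutions.

    `layers` is a list of (kernel_size, stride) pairs applied in order.
    Dilation and padding are ignored in this simplified sketch.
    """
    rf = 1    # a single input pixel covers only itself
    jump = 1  # spacing, in input pixels, between neighbouring outputs so far
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Hypothetical stacks, not the TMNet configuration.
plain_stack = [(3, 1)] * 5    # five 3x3 convolutions, stride 1
strided_stack = [(3, 2)] * 5  # five 3x3 convolutions, stride 2

print(receptive_field(plain_stack))    # 11
print(receptive_field(strided_stack))  # 63
```

Under these assumptions, five stride-1 layers see only an 11-pixel span, and even aggressive striding reaches 63 pixels, which is small next to a typical remote sensing tile. This is the kind of limitation that motivates pairing a CNN encoder with an attention-based encoder such as the Swin Transformer.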

Citing Articles

MFMamba: A Mamba-Based Multi-Modal Fusion Network for Semantic Segmentation of Remote Sensing Images.

Wang Y, Cao L, Deng H Sensors (Basel). 2024; 24(22).

PMID: 39599043 PMC: 11598657. DOI: 10.3390/s24227266.

Artificial intelligence automatic measurement technology of lumbosacral radiographic parameters.

Yuan S, Chen R, Liu X, Wang T, Wang A, Fan N Front Bioeng Biotechnol. 2024; 12:1404058.

PMID: 39011157 PMC: 11246908. DOI: 10.3389/fbioe.2024.1404058.
