
Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks

Overview
Date 2020 Jun 4
PMID 32491986
Citations 19
Abstract

The use of RGB-D information for salient object detection (SOD) has been extensively explored in recent years. However, relatively little effort has gone toward modeling SOD in real-world human activity scenes with RGB-D. In this article, we fill the gap by making the following contributions to RGB-D SOD: 1) we carefully collect a new Salient Person (SIP) data set that consists of ~1K high-resolution images covering diverse real-world scenes with various viewpoints, poses, occlusions, illuminations, and backgrounds; 2) we conduct a large-scale (and, so far, the most comprehensive) benchmark comparing contemporary methods, which has long been missing in the field and can serve as a baseline for future research; to this end, we systematically summarize 32 popular models and evaluate 18 of them on seven data sets containing a total of about 97K images; and 3) we propose a simple general architecture, called deep depth-depurator network (D3Net). It consists of a depth depurator unit (DDU) and a three-stream feature learning module (FLM), which perform low-quality depth map filtering and cross-modal feature learning, respectively. These components form a nested structure and are elaborately designed to be learned jointly. D3Net exceeds the performance of all prior contenders across all five metrics under consideration, thus serving as a strong model to advance research in this field. We also demonstrate that D3Net can be used to efficiently extract salient object masks from real scenes, enabling an effective background-changing application at a speed of 65 frames/s on a single GPU. All the saliency maps, our new SIP data set, the D3Net model, and the evaluation tools are publicly available at https://github.com/DengPingFan/D3NetBenchmark.
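The background-changing application mentioned in the abstract can be illustrated with a minimal sketch, assuming a saliency map has already been predicted by some SOD model. This is not the authors' pipeline: the function name `replace_background`, its parameters, and the optional `threshold` argument are hypothetical and chosen for illustration only. The sketch uses the saliency map as an alpha matte and blends the original image with a new background.

```python
import numpy as np


def replace_background(image, saliency, background, threshold=None):
    """Composite the salient foreground of `image` onto `background`.

    Hypothetical helper, not taken from the D3Net code base.

    image, background: float arrays of shape (H, W, 3) with values in [0, 1].
    saliency: float array of shape (H, W) with values in [0, 1],
        where higher values mean "more salient".
    threshold: if given, the saliency map is binarized at this value;
        otherwise it is used directly as a soft alpha matte.
    """
    if image.shape != background.shape:
        raise ValueError("image and background must have the same shape")
    if saliency.shape != image.shape[:2]:
        raise ValueError("saliency must match the image height and width")

    # Clamp to the valid alpha range in case the map is slightly out of bounds.
    alpha = np.clip(saliency.astype(np.float64), 0.0, 1.0)
    if threshold is not None:
        # Hard mask: a pixel is either fully foreground or fully background.
        alpha = (alpha >= threshold).astype(np.float64)

    # Add a channel axis so the (H, W, 1) mask broadcasts over RGB.
    alpha = alpha[..., None]
    return alpha * image + (1.0 - alpha) * background
```

A soft matte keeps smooth object boundaries, while passing `threshold` yields a hard cut-out. Which one looks better depends on the quality of the predicted map.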

Citing Articles

LRNet: lightweight attention-oriented residual fusion network for light field salient object detection.

Ma S, Zhu X, Xu L, Zhou L, Chen D. Sci Rep. 2024; 14(1):26030.

PMID: 39472603 PMC: 11522285. DOI: 10.1038/s41598-024-76874-0.


SLMSF-Net: A Semantic Localization and Multi-Scale Fusion Network for RGB-D Salient Object Detection.

Peng Y, Zhai Z, Feng M. Sensors (Basel). 2024; 24(4).

PMID: 38400274 PMC: 10892948. DOI: 10.3390/s24041117.


Lightweight Cross-Modal Information Mutual Reinforcement Network for RGB-T Salient Object Detection.

Lv C, Wan B, Zhou X, Sun Y, Zhang J, Yan C. Entropy (Basel). 2024; 26(2).

PMID: 38392385 PMC: 10888287. DOI: 10.3390/e26020130.


Swin Transformer-Based Edge Guidance Network for RGB-D Salient Object Detection.

Wang S, Jiang F, Xu B. Sensors (Basel). 2023; 23(21).

PMID: 37960501 PMC: 10650861. DOI: 10.3390/s23218802.


RGB-D salient object detection via convolutional capsule network based on feature extraction and integration.

Xu K, Guo J. Sci Rep. 2023; 13(1):17652.

PMID: 37848501 PMC: 10582015. DOI: 10.1038/s41598-023-44698-z.