Advanced Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection
Li, Long ; Xie, Huichao ; Liu, Nian ; Zhang, Dingwen ; Anwer, Rao Muhammad ; Cholakkal, Hisham ; Han, Junwei
Department
Computer Vision
Type
Journal article
Date
2025
Language
English
Abstract
Most existing co-salient object detection (CoSOD) models focus solely on extracting co-saliency cues and neglect explicit exploration of background regions, which can make it difficult to handle interference from complex backgrounds. To address this, this paper proposes a Discriminative co-saliency and background Mining Transformer framework (DMT) that explicitly mines both co-saliency and background information and effectively models their discriminability. DMT first learns two types of tokens by disjointly extracting co-saliency and background information from segmentation features, then models discriminability within the segmentation features under the guidance of these learned tokens. In the first phase, we propose economical multi-grained correlation modules for efficient extraction of detection information, including Region-to-Region (R2R), Contrast-induced Pixel-to-Token (CtP2T), and Co-saliency Token-to-Token (CoT2T) correlation modules. In the second phase, we introduce Token-Guided Feature Refinement (TGFR) modules to enhance discriminability within the segmentation features. To further improve the discriminative modeling and practicality of DMT, we first upgrade the original TGFR's intra-image modeling to intra-group modeling, yielding Group TGFR (G-TGFR), which is better suited to the co-saliency task. We then design a Noise Propagation Suppression (NPS) mechanism that allows the model to be applied in a more practical open-world scenario, resulting in our extended version, DMT+O. Extensive experimental results on both conventional CoSOD and open-world CoSOD benchmark datasets demonstrate the effectiveness of the proposed model. The code is available at: https://github.com/dragonlee12345/Advanced-DMT.
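The two-phase idea in the abstract (tokens first gather information from segmentation features, then guide refinement of those features) can be illustrated with plain cross-attention. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the tensor shapes, and the use of single-head attention without learned projections are all hypothetical, and it does not reproduce the R2R, CtP2T, CoT2T, TGFR, or NPS modules.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def tokens_from_pixels(tokens, feats):
    # Phase 1 (illustrative only): each token queries the pixel features
    # and adds the attention-weighted summary to itself.
    # tokens: (T, C), feats: (N, C) -> (T, C)
    scale = np.sqrt(feats.shape[1])
    attn = softmax(tokens @ feats.T / scale, axis=-1)  # (T, N)
    return tokens + attn @ feats

def refine_features(feats, tokens):
    # Phase 2 (illustrative only): each pixel queries the tokens, so its
    # feature is pulled toward whichever token it matches best.
    # feats: (N, C), tokens: (T, C) -> refined (N, C), attention (N, T)
    scale = np.sqrt(feats.shape[1])
    attn = softmax(feats @ tokens.T / scale, axis=-1)  # (N, T)
    return feats + attn @ tokens, attn

rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))   # 16 flattened pixels, 8 channels (assumed sizes)
tokens = rng.normal(size=(2, 8))   # row 0: co-saliency token, row 1: background token

tokens = tokens_from_pixels(tokens, feats)
refined, attn = refine_features(feats, tokens)
```

Each row of `attn` is a distribution over the two tokens, which is one simple way to read off how strongly a pixel aligns with the co-saliency token versus the background token in this toy setting.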
Citation
L. Li et al., "Advanced Discriminative Co-Saliency and Background Mining Transformer for Co-Salient Object Detection," in IEEE Transactions on Pattern Analysis and Machine Intelligence, doi: 10.1109/TPAMI.2025.3573054.
Source
IEEE Transactions on Pattern Analysis and Machine Intelligence
Keywords
Co-salient object detection, Multi-grained correlations, Discriminability modeling, Transformer, Open-world visual recognition
Publisher
IEEE
