NTRENet++: Unleashing the Power of Non-Target Knowledge for Few-Shot Semantic Segmentation
Liu, Yuanwei ; Liu, Nian ; Wu, Yi ; Cholakkal, Hisham ; Anwer, Rao Muhammad ; Yao, Xiwen ; Han, Junwei
Department
Computer Vision
Type
Journal article
Date
2025
Language
English
Abstract
Few-shot semantic segmentation (FSS) aims to segment a target object given only a few annotated samples. However, current studies on FSS primarily concentrate on extracting information related to the target object, resulting in inadequate identification of ambiguous regions, particularly in non-target areas, including the background (BG) and Distracting Objects (DOs). To alleviate this problem, we propose a novel framework, namely NTRENet++, to explicitly mine and eliminate BG and DO regions in the query. First, we introduce a BG Mining Module (BGMM) to extract BG information and generate a comprehensive BG prototype from all images. For this purpose, a BG mining loss is formulated to supervise the learning of BGMM, utilizing only the known target object segmentation ground truth. Subsequently, based on this BG prototype, we employ a BG Eliminating Module to filter out the BG information from the query and obtain a BG-free result. Following this, the target information is utilized in the target matching module to generate the initial segmentation result. Finally, a DO Eliminating Module is proposed to further mine and eliminate DO regions, yielding a target object segmentation result free of both BG and DOs. Moreover, we present a prototypical-pixel contrastive learning algorithm to enhance the model's capability to differentiate the target object from DOs. Extensive experiments conducted on both the PASCAL-5i and COCO-20i datasets demonstrate the effectiveness of our approach despite its simplicity. Additionally, we extend our method to the few-shot video object segmentation task and achieve improved performance over a baseline model, demonstrating its generalization ability. Code is available at https://github.com/LIUYUANWEI98/NTRENet++.
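The BG mining and elimination steps described in the abstract can be sketched with generic prototype operations. This is a minimal illustration, not the paper's implementation: the function names, the masked-average-pooling prototype, and the cosine-threshold elimination rule are all illustrative assumptions.

```python
import numpy as np

def masked_average_pooling(feat, mask):
    """Pool the features under a binary mask into a single prototype vector.

    feat: (C, H, W) feature map; mask: (H, W) binary mask of the region.
    Returns a (C,) prototype (a stand-in for the paper's BG prototype).
    """
    weights = mask / (mask.sum() + 1e-6)
    return (feat * weights[None]).sum(axis=(1, 2))

def cosine_similarity_map(feat, prototype):
    """Per-pixel cosine similarity between a (C, H, W) map and a (C,) prototype."""
    f = feat / (np.linalg.norm(feat, axis=0, keepdims=True) + 1e-6)
    p = prototype / (np.linalg.norm(prototype) + 1e-6)
    return np.einsum('chw,c->hw', f, p)  # (H, W) similarity scores

def eliminate_background(query_feat, bg_prototype, threshold=0.5):
    """Suppress query pixels that resemble the BG prototype (illustrative rule).

    Pixels whose similarity to the prototype exceeds `threshold` are zeroed,
    leaving a 'BG-free' feature map for subsequent target matching.
    """
    bg_score = cosine_similarity_map(query_feat, bg_prototype)
    keep = (bg_score < threshold).astype(query_feat.dtype)
    return query_feat * keep[None], bg_score
```

In the actual method, the BG prototype is learned across all images under a dedicated BG mining loss rather than pooled from a known BG mask, and elimination is learned rather than a fixed threshold; the sketch only conveys the prototype-matching idea.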
Citation
Y. Liu et al., "NTRENet++: Unleashing the Power of Non-Target Knowledge for Few-Shot Semantic Segmentation," in IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 5, pp. 4314-4328, May 2025, doi: 10.1109/TCSVT.2024.3519573
Source
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Keywords
Few-shot learning, Few-shot segmentation, Semantic segmentation, Video object segmentation
Publisher
IEEE
