iSeg: An Iterative Refinement-based Framework for Training-free Segmentation
Sun, Lin ; Cao, Jiale ; Xie, Jin ; Khan, Fahad Shahbaz ; Pang, Yanwei
Department
Computer Vision
Type
Journal article
Language
English
Abstract
Stable Diffusion has demonstrated a strong ability to synthesize images from text descriptions, which suggests that it contains strong semantic cues for grouping objects. Researchers have therefore explored employing Stable Diffusion for training-free segmentation. Most existing approaches refine the cross-attention map with the self-attention map only once, demonstrating that the self-attention map contains useful semantic information for improving segmentation. To fully utilize the self-attention map, we present an in-depth experimental analysis of iteratively refining the cross-attention map with the self-attention map, and propose an effective iterative refinement framework for training-free segmentation, named iSeg. iSeg introduces an entropy-reduced self-attention module that uses a gradient descent scheme to reduce the entropy of the self-attention map, thereby suppressing weak responses that correspond to irrelevant global information. Leveraging this module, iSeg stably improves the cross-attention map through iterative refinement. Further, we design a category-enhanced cross-attention module that generates an accurate cross-attention map, providing a better initial input for iterative refinement. Extensive experiments across different datasets and diverse segmentation tasks (weakly-supervised semantic segmentation, open-vocabulary semantic segmentation, unsupervised segmentation, and mask generation on a synthetic dataset) demonstrate the merits of the proposed contributions, leading to promising performance. For unsupervised semantic segmentation on Cityscapes, iSeg achieves an absolute gain of $3.8\%$ in mIoU over the best existing training-free approach in the literature. Moreover, iSeg supports segmentation with different kinds of images and interactions, and can also be used as a post-processing step, or within different frameworks, to improve training-free segmentation. The project is available at https://linsun449.github.io/iSeg.
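The two ideas named in the abstract, reducing the entropy of the self-attention map by gradient descent and then using it to refine the cross-attention map iteratively, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`row_entropy`, `reduce_entropy`, `iterative_refine`), the exact update rule, the step size, the per-category min-max normalization, and the random toy inputs are all assumptions made for illustration. The sketch also assumes the step size is small enough that every row keeps at least one positive entry after the update.

```python
import numpy as np


def row_entropy(attn, eps=1e-12):
    """Shannon entropy of each row of a row-stochastic matrix."""
    p = np.clip(attn, eps, 1.0)
    return -(attn * np.log(p)).sum(axis=1)


def reduce_entropy(self_attn, lr=0.05, steps=1, eps=1e-12):
    """Hypothetical entropy-reduction step (an assumption, not the paper's rule).

    Takes gradient-descent steps on the row entropy H(a) = -sum_j a_j log a_j,
    whose gradient is dH/da_j = -(log a_j + 1), then clips negatives to zero
    and renormalizes each row. Weak responses shrink faster than strong ones.
    """
    attn = self_attn.copy()
    for _ in range(steps):
        grad = -(np.log(np.maximum(attn, eps)) + 1.0)
        attn = np.maximum(attn - lr * grad, 0.0)
        attn = attn / attn.sum(axis=1, keepdims=True)
    return attn


def iterative_refine(cross_attn, self_attn, iters=5, eps=1e-12):
    """Repeatedly propagate cross-attention through self-attention.

    cross_attn: (tokens, categories); self_attn: (tokens, tokens).
    Each category column is min-max normalized after every iteration.
    """
    refined = cross_attn.copy()
    for _ in range(iters):
        refined = self_attn @ refined
        lo = refined.min(axis=0, keepdims=True)
        hi = refined.max(axis=0, keepdims=True)
        refined = (refined - lo) / (hi - lo + eps)
    return refined


# Toy inputs standing in for attention maps extracted from a diffusion model.
rng = np.random.default_rng(0)
n_tokens, n_classes = 6, 2
logits = rng.normal(size=(n_tokens, n_tokens))
self_attn = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
cross_attn = rng.random((n_tokens, n_classes))

sharp_attn = reduce_entropy(self_attn, lr=0.05, steps=3)
masks = iterative_refine(cross_attn, sharp_attn, iters=5).argmax(axis=1)
```

Here `masks` assigns each token to the category with the strongest refined response. In a real pipeline the attention maps would come from the diffusion model rather than from random numbers.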
Citation
L. Sun, J. Cao, J. Xie, F.S. Khan, Y. Pang, "iSeg: An Iterative Refinement-based Framework for Training-free Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PP, no. 99, pp. 1-17, 2026, https://doi.org/10.1109/tpami.2026.3681368.
Source
IEEE Transactions on Pattern Analysis and Machine Intelligence
Keywords
46 Information and Computing Sciences, 4611 Machine Learning
Publisher
IEEE
