LawDIS: Language-Window-Based Controllable Dichotomous Image Segmentation

Yan, Xinyu
Sun, Meijun
Ji, Ge-Peng
Fan, Deng-Ping
Khan, Salman
Khan, Fahad Shahbaz
Department
Computer Vision
Type
Conference proceeding
Abstract
We present LawDIS, a language-window-based controllable dichotomous image segmentation (DIS) framework that produces high-quality object masks. Our framework recasts DIS as an image-conditioned mask generation task within a latent diffusion model, enabling seamless integration of user controls. LawDIS is enhanced with macro-to-micro control modes. Specifically, in macro mode, we introduce a language-controlled segmentation strategy (LS) to generate an initial mask based on user-provided language prompts. In micro mode, a window-controlled refinement strategy (WR) allows flexible refinement of user-defined regions (i.e., size-adjustable windows) within the initial mask. Coordinated by a mode switcher, these modes can operate independently or jointly, making the framework well-suited for high-accuracy, personalised applications. Extensive experiments on the DIS5K benchmark show that LawDIS significantly outperforms 11 cutting-edge methods across all metrics. Notably, compared to the second-best model, MVANet, we achieve $F_{\beta}^{\omega}$ gains on DIS-TE of 4.6% with both the LS and WR strategies and 3.6% with the LS strategy alone. Code will be made available at https://github.com/XinyuYanTJU/LawDIS.
Citation
X. Yan, M. Sun, G.-P. Ji, D.-P. Fan, S. Khan, F.S. Khan, "LawDIS: Language-Window-Based Controllable Dichotomous Image Segmentation," 2026, pp. 23902-23911.
Source
2025 IEEE/CVF International Conference on Computer Vision (ICCV)
Conference
2025 IEEE/CVF International Conference on Computer Vision (ICCV)
Keywords
46 Information and Computing Sciences; 47 Language, Communication and Culture; 4704 Linguistics
Publisher
IEEE