MPromer: A Unified Diffusion-Based Framework for Scalable and Generalizable Multi-Modal Medical Image Segmentation
Ghallabi, Wafa Al ; Dudhane, Akshay ; Zamir, Syed Waqas ; Khan, Salman ; Khan, Fahad Shahbaz
Department
Computer Vision
Type
Conference proceeding
Abstract
Multi-modal medical image analysis is essential for comprehensive diagnostics; however, existing segmentation models often struggle to generalize across diverse imaging modalities such as MRI, CT, fundus imaging, and colonoscopy. While recent diffusion-based approaches have shown promising results, they typically rely on task-specific training, which limits their scalability and imposes significant computational demands. To address these limitations, we propose MPromer, a robust and adaptable segmentation framework that incorporates multi-scale implicit prompting within a diffusion-based architecture. In contrast to traditional prompt-driven methods, MPromer adapts automatically to various imaging modalities without requiring manually designed prompts or retraining for individual tasks. By integrating prompt-conditioned diffusion processes into an encoder-decoder structure, the model achieves consistent and effective segmentation across a wide range of medical domains. We evaluate MPromer on six benchmark datasets, demonstrating state-of-the-art performance with strong generalization capabilities. In addition to improved segmentation accuracy, MPromer enhances computational efficiency and extends naturally to multi-label segmentation tasks, making it well-suited for complex clinical applications. The framework provides a scalable and efficient solution that minimizes the need for fine-tuning, which is particularly beneficial in resource-constrained medical environments. Our code and models are available at https://github.com/wafaAlghallabi/MPromer.
Citation
W.A. Ghallabi, A. Dudhane, S.W. Zamir, S. Khan, F.S. Khan, "MPromer: A Unified Diffusion-Based Framework for Scalable and Generalizable Multi-Modal Medical Image Segmentation," in 2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2026, pp. 930-938.
Source
2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Conference
2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)
Keywords
46 Information and Computing Sciences, 4603 Computer Vision and Multimedia Computation
Publisher
IEEE
