MedMask: A Self-supervised Vision Foundation Model for Breast Cancer Detection Using Mammograms
Ashraf, Tajamul ; Khan, Ufaq Jeelani ; Xie, Yutong ; Bashir, J.
Department
Computer Vision
Type
Conference proceeding
Date
2026
Language
English
Abstract
A major challenge in medical imaging is the limited availability of large, well-annotated datasets. In breast cancer detection from mammograms (BCDM), obtaining precise bounding box annotations for regions of interest is costly and labor-intensive, whereas large collections of unannotated mammograms are often readily available. Motivated by this observation, we propose a self-supervised fine-tuning framework for BCDM. Traditional object detection models, designed for natural images in which objects are abundant, struggle in medical imaging because annotated data is limited. To address this challenge, we introduce MedMask, a novel self-supervised framework that combines masked autoencoders (MAE) with vision foundation models (VFMs) in a transformer-based architecture. We propose a customized MAE module that uses the transformer’s encoder and an auxiliary decoder to mask and reconstruct multi-scale feature maps, enabling efficient learning from limited annotations while capturing domain-specific features. Additionally, we leverage the zero-shot capabilities of VFMs through a proposed expert contrastive knowledge distillation technique to learn better representations. Our approach outperforms the state of the art on the publicly available INBreast and DDSM datasets, achieving sensitivity improvements of 22% and 17%, respectively. It also achieves a 27% improvement on the RSNA-BSD1K dataset. Code is available at https://github.com/Tajamul21/MedMask.
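The masking-and-reconstruction idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it only shows the general masked-autoencoder pattern of hiding a random subset of feature tokens at each scale and scoring the reconstruction on the hidden positions alone. The function names, the toy feature maps, the 50% mask ratio, and the identity "decoder" are all assumptions made for illustration.

```python
import random

def random_mask(num_tokens, mask_ratio, seed=0):
    """Return a list of booleans; True marks a masked token position."""
    rng = random.Random(seed)
    num_masked = int(round(num_tokens * mask_ratio))
    masked = set(rng.sample(range(num_tokens), num_masked))
    return [i in masked for i in range(num_tokens)]

def apply_mask(features, mask, mask_value=0.0):
    """Replace each masked token vector with a constant placeholder vector."""
    return [[mask_value] * len(vec) if m else list(vec)
            for vec, m in zip(features, mask)]

def masked_mse(target, reconstruction, mask):
    """Mean squared error computed over masked positions only."""
    total, count = 0.0, 0
    for t, r, m in zip(target, reconstruction, mask):
        if m:
            total += sum((a - b) ** 2 for a, b in zip(t, r))
            count += len(t)
    return total / count if count else 0.0

# Toy multi-scale feature maps (hypothetical): each scale is a list of token vectors.
scales = {
    "stride_8": [[float(i + j) for j in range(4)] for i in range(16)],
    "stride_16": [[float(i * j) for j in range(4)] for i in range(4)],
}

total_loss = 0.0
for name, feats in scales.items():
    mask = random_mask(len(feats), mask_ratio=0.5)
    visible = apply_mask(feats, mask)
    # Placeholder for an auxiliary decoder: here it just echoes its input,
    # so the loss measures how much information the masking removed.
    reconstruction = visible
    total_loss += masked_mse(feats, reconstruction, mask)
```

In a real model the placeholder decoder would be a trainable network, and minimizing the summed loss across scales would push the encoder to produce features from which the hidden regions can be predicted.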
Citation
T. Ashraf, S. Salmani, M. Peerzada, U. Khan, Y. Xie, and J. Bashir, “MedMask: A Self-supervised Vision Foundation Model for Breast Cancer Detection Using Mammograms,” pp. 340–350, 2026, doi: 10.1007/978-3-032-05559-0_34
Source
Lecture Notes in Computer Science
Conference
2nd Deep Breast Workshop on Artificial Intelligence and Imaging for Diagnostic and Treatment Challenges in Breast Care, Deep-Breath 2025, held in conjunction with the 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
Keywords
Breast Cancer Detection, Masked Autoencoders, Medical Imaging, Vision Foundation Models
Publisher
Springer Nature
