Advancing Out-of-Distribution Generalization in Medical Imaging
Rahman, Umaima
Author
Department
Computer Vision
Embargo End Date
2025-05-30
Type
Dissertation
Date
2025
Language
English
Abstract
Medical imaging plays a crucial role in disease diagnosis and clinical decision-making, yet deep learning models often struggle with distribution shifts, limited labeled data, and spurious correlations, reducing their reliability in real-world settings. These challenges arise from variations in imaging protocols, equipment, and patient demographics, making out-of-distribution (OOD) generalization a critical research problem. This thesis proposes novel methodologies to enhance robustness, transferability, and generalization in medical image analysis, addressing the limitations of existing models. We first introduce MRIShift, a disentangled representation learning framework for 3D MRI lesion segmentation under domain shifts. By separating content and style features, MRIShift ensures consistent segmentation across diverse datasets, preserving anatomical structures while adapting to domain-specific variations. Extending this idea beyond segmentation, we propose DiMPLe (Disentangled Multi-Modal Prompt Learning), which improves vision-language model (VLM) generalization by explicitly disentangling invariant and spurious features across modalities. Unlike existing approaches that treat visual and textual embeddings holistically, DiMPLe minimizes the impact of dataset biases, leading to better OOD robustness and zero-shot classification performance. To further address data scarcity in medical imaging, we propose MedUnA (Medical Unsupervised Adaptation of Vision-Language Models), which leverages unpaired images and texts using contrastive alignment. MedUnA eliminates the need for labeled training data, instead utilizing learned text representations to enhance classification accuracy in low-resource clinical settings. Finally, we explore cross-disease transferability (XDT), enabling models trained on one disease to generalize to novel diseases in a zero-shot setting.
By leveraging shared visual patterns among diseases within the same organ, our XDT framework provides a scalable and cost-effective diagnostic tool for settings with limited expert annotations. The proposed methods are evaluated across a diverse set of medical datasets, including Shifts 2.0 MS WML, IDRiD, ISIC, Shenzhen TB, Montgomery TB, Guangzhou Pneumonia, and MedIMeta, covering multiple imaging modalities and disease types. Performance is assessed using metrics such as accuracy, Dice score, area under the retention curve (R-AUC), and macro F1 score to comprehensively evaluate robustness, accuracy, and model calibration across both in-distribution and OOD settings. Overall, this thesis presents a cohesive set of methodologies that improve the robustness, adaptability, and generalization of AI models for medical imaging. By integrating disentangled representation learning, multimodal prompt learning, and vision-language adaptation, we develop scalable, label-efficient solutions that enhance diagnostic reliability, particularly in resource-constrained environments. These contributions not only push the boundaries of the state of the art in medical imaging but also pave the way for meaningful impact in real-world clinical practice.
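The contrastive alignment that MedUnA builds on can be illustrated with a CLIP-style symmetric InfoNCE objective. The sketch below is illustrative only — the function name, NumPy implementation, and temperature value are assumptions for exposition, not the thesis's actual formulation: matched image/text embedding pairs are pulled together while mismatched pairs in the batch are pushed apart.

```python
import numpy as np

def info_nce_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of image/text embedding pairs.

    Row i of img_emb is assumed to correspond to row i of txt_emb.
    """
    # L2-normalize so the dot product is a cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix, scaled by temperature
    logits = img @ txt.T / temperature
    labels = np.arange(len(img))  # the i-th text is the positive for the i-th image

    def cross_entropy(l):
        # Numerically stable log-softmax over each row
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Average of image-to-text and text-to-image directions
    return (cross_entropy(logits) + cross_entropy(logits.T)) / 2
```

Correctly paired embeddings yield a lower loss than shuffled (misaligned) pairs, which is the signal that drives alignment during training.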
Citation
Umaima Rahman, “Advancing Out-of-Distribution Generalization in Medical Imaging,” Doctor of Philosophy thesis, Computer Vision, MBZUAI, 2025.
Keywords
Out-of-Distribution Generalization, Medical Image Analysis, Vision-Language Models, AI for Healthcare, Domain Generalization
