
Enhancing Medical Image Segmentation with Novel Architectures and Learning Strategies

Kareem, Daniya Najiha Abdul
Department
Computer Vision
Embargo End Date
2025-05-30
Type
Dissertation
Date
2025
Language
English
Abstract
In the past decade, artificial intelligence (AI)-based healthcare solutions have advanced considerably, largely driven by deep learning models tailored for medical data analysis. Among medical data analysis tasks, medical image segmentation is a complex task that aims to accurately identify and delineate the pixels or voxels associated with specific target classes. The intricacies of medical images, including inter-modality and intra-modality variations as well as data scarcity and imbalance, pose significant hurdles. Consequently, many state-of-the-art segmentation approaches developed for natural images struggle to produce accurate segmentation predictions on medical image data. This thesis addresses several challenges in medical image segmentation through multiple distinct contributions. The architectural contributions aim to improve segmentation quality in volumetric data and to enhance computational efficiency. Furthermore, advanced learning strategies are developed to address data-related issues such as class imbalance, scarcity of labelled data, and the appearance of novel object classes during inference. First, the thesis investigates architectural design choices to enhance the performance of deep learning-based volumetric segmentation methods, with a focus on improving the segmentation quality of elongated structures and object boundaries. By exploring the strengths of convolutional neural networks (CNNs), transformers, and multi-layer perceptron (MLP) mixers, a novel volumetric MLP-mixer-based hybrid architecture is introduced to capture global features more effectively and improve boundary prediction in complex 3D volumetric datasets.
Second, the thesis proposes DwinFormer, a CNN-transformer hybrid framework that expands the receptive field beyond conventional windowed attention approaches (e.g., Swin attention) while remaining more computationally efficient than full self-attention. In this architecture, we introduce the Nested Dwin Attention (NDA) layer, which enables attention computation with global receptive fields along the horizontal, vertical, and depthwise directions, as well as the Convolutional Dwin Attention (CDA) layer for global contextual feature extraction. As a third contribution, this thesis investigates segmentation architectures that balance computational complexity against segmentation performance. To this end, we present InceptionMamba, a lightweight network that combines convolutions with the Mamba block within a novel InceptionMamba Module (IMM) to capture diverse contextual information. By incorporating a state-space model and employing a lightweight decoder, we achieve state-of-the-art performance on a range of 2D medical and microscopic image segmentation datasets, while significantly reducing computational cost. Finally, this thesis addresses real-world data and learning challenges in medical image segmentation. In practice, medical datasets are often small and imbalanced, and may include unknown target categories not seen during training. Effectively distinguishing such unknown categories while accurately recognising previously trained categories is crucial for robust segmentation. To this end, we propose a semi-supervised open-set learning approach for medical image classification and segmentation, in which we also introduce novel feature regularisation and weight normalisation strategies to mitigate class imbalance in medical datasets.
To summarise, the multiple interconnected contributions of this thesis aim to tackle various challenges in both 2D and 3D medical and microscopic image segmentation, thereby advancing the application of deep learning methods in this important domain.
Citation
Daniya Najiha Abdul Kareem, “Enhancing Medical Image Segmentation with Novel Architectures and Learning Strategies,” Doctor of Philosophy thesis, Computer Vision, MBZUAI, 2025.
Keywords
Segmentation, Volumetric data, Data-Imbalance, Data Scarcity, Semi-supervised Learning