Robust Multimodal Learning for Ophthalmic Disease Grading via Disentangled Representation
Wang, Xinkun ; Wang, Yifang ; Liang, Senwei ; Tang, Feilong ; Liu, Chengzhi ; Hu, Ming ; Hu, Chao ; He, Junjun ; Ge, Zongyuan ; Razzak, Imran
Department
Computational Biology
Type
Conference proceeding
Date
2025
Language
English
Abstract
Ophthalmologists often rely on multimodal data to improve diagnostic precision. However, complete-modality data are rare in real applications due to a lack of medical equipment and data privacy concerns. Traditional deep learning approaches usually address these problems by learning representations in latent space. However, we highlight two critical limitations of these approaches: (i) task-irrelevant redundant information in complex modalities (e.g., massive slices) leads to significant redundancy in latent-space representations; (ii) overlapping multimodal representations make it challenging to extract features unique to each modality. To address these limitations, we introduce the Essence-Point and Disentangled Representation Learning (EDRL) strategy, which integrates a self-distillation mechanism into an end-to-end framework to enhance feature selection and disentanglement for robust multimodal learning. Specifically, the Essence-Point Representation Learning module selects discriminative features that improve disease grading performance, while the Disentangled Representation Learning module separates multimodal data into modality-common and modality-unique representations, reducing feature entanglement and enhancing both robustness and interpretability in ophthalmic disease diagnosis. Experiments on ophthalmology multimodal datasets demonstrate that the proposed EDRL strategy significantly outperforms state-of-the-art methods. Code is available at GitHub Repository.
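The disentanglement idea in the abstract — splitting each modality's features into a modality-common code (aligned across modalities) and a modality-unique code (kept distinct from the common one) — can be sketched minimally as follows. This is an illustrative toy, not the authors' EDRL implementation: the linear projection heads, the MSE alignment term, and the cosine-orthogonality penalty are all assumptions chosen for brevity.

```python
import numpy as np

def disentangle(feats, W_common, W_unique):
    """Split one modality's features into common and unique codes
    via two (hypothetical) linear projection heads."""
    return feats @ W_common, feats @ W_unique

def _cosine(a, b):
    # Row-wise cosine similarity with a small epsilon for stability.
    return np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1) + 1e-8
    )

def disentangle_loss(c_a, c_b, u_a, u_b):
    """Toy disentanglement objective for two modalities a and b."""
    # Pull the modality-common codes together across modalities ...
    align = np.mean((c_a - c_b) ** 2)
    # ... and penalize overlap between each modality's common and
    # unique codes (absolute cosine as a soft orthogonality term).
    orth = np.mean(np.abs(_cosine(c_a, u_a))) + np.mean(np.abs(_cosine(c_b, u_b)))
    return align + orth

rng = np.random.default_rng(0)
x_a = rng.standard_normal((4, 16))   # e.g. fundus-image features
x_b = rng.standard_normal((4, 16))   # e.g. OCT features
W_c, W_u = rng.standard_normal((16, 8)), rng.standard_normal((16, 8))
c_a, u_a = disentangle(x_a, W_c, W_u)
c_b, u_b = disentangle(x_b, W_c, W_u)
loss = disentangle_loss(c_a, c_b, u_a, u_b)
```

In a full model each head would be learned, and the paper additionally couples this with self-distillation and essence-point feature selection, which this sketch omits.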
Citation
X. Wang et al., “Robust Multimodal Learning for Ophthalmic Disease Grading via Disentangled Representation,” pp. 447–456, 2026, doi: 10.1007/978-3-032-04984-1_43
Source
Medical Image Computing and Computer Assisted Intervention
Conference
MICCAI 2025: 28th International Conference on Medical Image Computing and Computer-Assisted Intervention
Keywords
Missing Modality, Multi Modality, Ophthalmic Disease
Publisher
Springer Nature
