A Novel Perspective for Multi-Modal Multi-Label Skin Lesion Classification

Zhang, Yuan
Xie, Yutong
Wang, Hu
Avery, Jodie C
Hull, M Louise
Carneiro, Gustavo
Department
Computer Vision
Type
Conference proceeding
Date
2025
Language
English
Abstract
The efficacy of deep learning-based Computer-Aided Diagnosis (CAD) methods for skin diseases relies on analyzing multiple data modalities (i.e., clinical images, dermoscopic images, and patient metadata) and addressing the challenges of multi-label classification. Current approaches tend to rely on limited multi-modal techniques and treat the multi-label problem as multiple multi-class problems, overlooking issues related to imbalanced learning and multi-label correlation. This paper introduces the Skin Lesion Classifier, a Multi-modal Multi-label TransFormer-based model (SkinM2Former). For multi-modal analysis, we introduce the Tri-Modal Cross-attention Transformer (TMCT), which fuses the two image modalities and the metadata modality at various feature levels of a transformer encoder. For multi-label classification, we introduce a multi-head attention (MHA) module to learn multi-label correlations, complemented by an optimisation that handles multi-label and imbalanced learning problems. SkinM2Former achieves a mean average accuracy of 77.27% and a mean diagnostic accuracy of 77.85% on the public Derm7pt dataset, outperforming state-of-the-art (SOTA) methods.
Citation
Y. Zhang, Y. Xie, H. Wang, J. C. Avery, M. L. Hull and G. Carneiro, "A Novel Perspective for Multi-Modal Multi-Label Skin Lesion Classification," 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, AZ, USA, 2025, pp. 3549-3558, doi: 10.1109/WACV61041.2025.00350.
Source
Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
Conference
2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Keywords
Skin lesion, multi-label classification, multi-modal learning
Publisher
IEEE