Toward Disentangled and Controllable Deep Metric Learning With Human-Like Concept Decomposition

Chen, Shuhuang
Chen, Shiming
Ye, Shuo
Wang, Yuetian
You, Xinge
Department
Computer Vision
Type
Journal article
Date
2025
Language
English
Abstract
Deep metric learning (DML) has made significant advances in learning discriminative embeddings for images, playing a crucial role in various vision tasks. However, existing methods typically rely on deep neural networks to extract holistic embeddings, which are challenging to disentangle and interpret. To address this issue, we take inspiration from human cognition, where objects are decomposed into distinct concepts for better understanding. Specifically, we propose the concept metrics network (CMN) to achieve disentangled and controllable DML. CMN begins by initializing learnable concept vectors to represent various visual concepts. These vectors are then associated with regional visual features via a cross-attention mechanism, ensuring that each vector corresponds to specific visual properties. Finally, the concept values, determined by the presence of each concept in the image, form the output embedding. Comprehensive experiments demonstrate that CMN effectively disentangles visual concepts, with each embedding dimension corresponding to a specific concept. Our method not only outperforms existing state-of-the-art methods in the conventional DML application (i.e., image retrieval), but also enables more flexible and controllable applications. The code is available at https://github.com/shchen0001/CMN
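The pipeline outlined in the abstract (concept vectors, cross-attention over regional features, one concept value per embedding dimension) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available at the repository linked above. The scaled dot-product attention, the scoring of each concept by its dot product with the attended feature, the function name `concept_embedding`, and the random stand-in inputs are all assumptions made for illustration only.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def concept_embedding(concepts, regions):
    """Hypothetical helper: map K concept vectors and R regional features
    (all of dimension D) to a K-dimensional embedding, one value per concept."""
    dim = len(concepts[0])
    scale = math.sqrt(dim)
    embedding = []
    for c in concepts:
        # Cross-attention: the concept vector acts as the query over regions.
        weights = softmax([dot(c, r) / scale for r in regions])
        # Attention-weighted sum of the regional features.
        attended = [sum(w * r[i] for w, r in zip(weights, regions))
                    for i in range(dim)]
        # Assumed scoring rule: concept "presence" is the similarity between
        # the concept vector and the feature it attends to.
        embedding.append(dot(c, attended))
    return embedding


# Random stand-ins for learned concept vectors and backbone region features.
random.seed(0)
K, R, D = 4, 6, 8
concepts = [[random.gauss(0, 1) for _ in range(D)] for _ in range(K)]
regions = [[random.gauss(0, 1) for _ in range(D)] for _ in range(R)]
emb = concept_embedding(concepts, regions)
print(len(emb))
```

In this sketch the embedding has exactly one dimension per concept vector, so editing or masking a single dimension corresponds to editing a single concept, which is the sense in which such an embedding could be called controllable.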
Citation
S. Chen, S. Chen, S. Ye, Y. Wang and X. You, "Toward Disentangled and Controllable Deep Metric Learning With Human-Like Concept Decomposition," in IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 10, pp. 18628-18641, Oct. 2025, doi: 10.1109/TNNLS.2025.3587907
Source
IEEE Transactions on Neural Networks and Learning Systems
Keywords
Cross-attention, deep metric learning (DML), vector representation, visual concept
Publisher
IEEE