Similarity Regulation and Calibration Alignment for Weakly Supervised Text-Based Person Re-Identification
Fu, Ao ; Zhao, Jiaqi ; Zhou, Yong ; Du, Wenliang ; Yao, Rui ; El Saddik, Abdulmotaleb
Department
Computer Vision
Type
Journal article
Date
2025
Language
English
Abstract
Traditional text-based person re-identification relies on identity labels. However, annotating large datasets is impractical, since identity annotation is expensive and time-consuming. Weakly supervised text-based person re-identification, where only text–image pairs are available without identity annotations, is therefore highly practical in real-world settings. Weakly supervised person re-identification raises two issues that must be addressed: the alignment problem caused by the different modalities, and the cross-modal matching ambiguity caused by the lack of identity labels. In this article, we propose a similarity regulation and calibration alignment (SRCA) framework, which consists of two unimodal encoders, one for images and one for text, and a multi-modal encoder for the masked language modeling task. First, a similarity regulation (SR) strategy is proposed to relax the strict one-to-one constraints on the local similarities between different pairs by introducing a novel soft objective. The soft objective adjusts the hard objectives to achieve soft cross-modal alignment by establishing a many-to-many relationship between the two modalities. Second, a calibration alignment (CA) module is proposed to improve intra-class compactness by modeling pseudo-label assignment as an optimal transport problem. The ambiguity of cross-modal matching is reduced by aligning the features and pseudo-labels of the different modalities and gradually calibrating the pseudo-label distribution. Experimental results show that our method achieves clear advantages over existing methods and performs competitively with fully supervised methods.
Citation
A. Fu, J. Zhao, Y. Zhou, W. Du, R. Yao, and A. El Saddik, “Similarity Regulation and Calibration Alignment for Weakly Supervised Text-Based Person Re-Identification,” ACM Transactions on Multimedia Computing, Communications and Applications, vol. 21, no. 3, pp. 1–19, Mar. 2025, doi: 10.1145/3711861.
Source
ACM Transactions on Multimedia Computing, Communications and Applications
Keywords
Person Re-Identification, Cross-modal, Weakly Supervised
Publisher
Association for Computing Machinery
