Uncertainty-Aware Sign Language Video Retrieval with Probability Distribution Modeling
Wu, Xuan ; Li, Hongxiang ; Luo, Yuanjiang ; Cheng, Xuxin ; Zhuang, Xianwei ; Cao, Meng ; Fu, Keren
Department
Computer Vision
Type
Conference proceeding
Date
2025
Language
English
Abstract
Sign language video retrieval plays a key role in facilitating information access for the deaf community. Despite significant advances in video-text retrieval, the complexity and inherent uncertainty of sign language preclude the direct application of these techniques. Previous methods achieve mapping between sign language videos and text through fine-grained modal alignment. However, due to the scarcity of fine-grained annotations, the uncertainty inherent in sign language videos is underestimated, limiting further development of sign language retrieval tasks. To address this challenge, we propose a new Uncertainty-aware Probability Distribution Retrieval (UPRet), which conceptualizes the mapping process of sign language videos and texts in terms of probability distributions, explores their potential interrelationships, and enables flexible mappings. Experiments on three benchmarks demonstrate the effectiveness of our method, which achieves state-of-the-art results on How2Sign (59.1%), PHOENIX-2014T (72.0%), and CSL-Daily (78.4%). Our source code is available at https://github.com/xua222/UPRet. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
Citation
X. Wu et al., “Uncertainty-Aware Sign Language Video Retrieval with Probability Distribution Modeling,” Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 15102, pp. 390–408, Jan. 2025, doi: 10.1007/978-3-031-72784-9_22.
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Conference
European Conference on Computer Vision (ECCV)
Keywords
Probabilistic representations, Sign language video retrieval, Text-video retrieval
Publisher
Springer Nature
