Density-aware and Depth-aware Visual Representation for Zero-Shot Object Counting
Nan, Fang ; Tian, Feng ; Zhang, Ni ; Liu, Nian ; Miao, Haonan ; Dai, Guang ; Wang, Mengmeng
Department
Computer Vision
Type
Conference proceeding
Date
2025
Language
English
Abstract
Previous methods for zero-shot object counting often rely on CLIP semantic classifiers driven by class names. However, they ignore density and depth knowledge, both of which are crucial for counting. We therefore propose a density-aware and depth-aware prompt counting model, which captures density information by learning density-aware prompts with a density-aware contrastive loss and incorporates depth guidance through predefined depth-aware prompts. To facilitate training, we design two strategies, one for the standard counting loss and one for the contrastive loss: the former prioritizes larger and sparser objects initially and gradually shifts its focus to smaller and denser objects, while the latter adopts coarse-to-fine density learning. In addition, we construct a dataset named LVIS-372, which covers more real-world scenarios and has a more balanced instance distribution than existing ones. Experimental results demonstrate the effectiveness of the proposed method.
Citation
F. Nan et al., "Density-aware and Depth-aware Visual Representation for Zero-Shot Object Counting," ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 2025, pp. 1-5, doi: 10.1109/ICASSP49660.2025.10889987
Source
ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference
International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Keywords
Zero-shot, Object Counting, Depth, CLIP
Publisher
IEEE
