Infant Cry Detection Using Causal Temporal Representation
Fu, Minghao; Li, Danning; Gadhiya, Aryan; Lambright, Benjamin; Alowais, Mohamed; Bahnassy, Mohab; Elletter, Saad El Dine; Toyin, Hawau Olamide; Jiang, Haiyan; Zhang, Kun; et al.
Department
Natural Language Processing
Type
Conference proceeding
Date
2025
Language
English
Abstract
This paper addresses a major challenge in acoustic event detection, in particular infant cry detection in the presence of other sounds and background noise: the lack of precisely annotated data. We present two contributions, one for supervised and one for unsupervised infant cry detection. The first is an annotated dataset for cry segmentation, which enables supervised models to achieve state-of-the-art performance. The second is a novel unsupervised method, Causal Representation Sparse Transition Clustering (CRSTC), based on causal temporal representation, which helps address data scarcity more generally. By integrating the detected cry segments, we significantly improve the performance of downstream infant cry classification, highlighting the potential of this approach for infant care applications.
Citation
M. Fu et al., "Infant Cry Detection Using Causal Temporal Representation," ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 2025, pp. 1-5, doi: 10.1109/ICASSP49660.2025.10890051.
Source
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords
Pediatrics, Accuracy, Event Detection, Signal Processing, Acoustics, Background Noise, Speech Processing, Unsupervised Learning
Publisher
IEEE
