Learning Disentangled Representation for Multi-Modal Time-Series Sensing Signals
Cai, Ruichu ; Jiang, Zhifan ; Zheng, Kaitao ; Li, Zijian ; Chen, Weilin ; Chen, Xuexin ; Shen, Yifan ; Chen, Guangyi ; Hao, Zhifeng ; Zhang, Kun
Department
Machine Learning
Type
Conference proceeding
Date
2025
Language
English
Abstract
Multi-modal time series data are common in web technologies such as the Internet of Things (IoT). Existing methods for multi-modal time series representation learning aim to disentangle the modality-shared and modality-specific latent variables. Although these methods achieve notable performance on downstream tasks, they usually assume an orthogonal latent space. In real-world scenarios, however, the modality-specific and modality-shared latent variables may be dependent. We therefore propose a general generation process in which the modality-shared and modality-specific latent variables are dependent, and further develop a Multi-modAl TEmporal Disentanglement (MATE) model. Specifically, the MATE model is built on a temporally variational inference architecture with modality-shared and modality-specific prior networks for the disentanglement of latent variables. Furthermore, we establish identifiability results showing that the extracted representation is disentangled. More specifically, we first achieve subspace identifiability for the modality-shared and modality-specific latent variables by leveraging the pairing of multi-modal data. We then establish component-wise identifiability of the modality-specific latent variables by exploiting sufficient changes in the historical latent variables. Extensive experiments on 12 datasets show consistent improvements on different downstream tasks, highlighting the effectiveness of our method in real-world scenarios. © 2025 Copyright held by the owner/author(s).
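The abstract describes a variational architecture that partitions each modality's latent vector into modality-shared and modality-specific parts, each regularized against its own prior network. The minimal sketch below is an illustrative assumption, not the paper's implementation: it partitions a toy diagonal-Gaussian posterior into shared and specific blocks and computes the separate KL terms such an architecture would optimize (the paper's actual priors are learned networks and, crucially, allow dependence between the two blocks).

```python
import math

def kl_diag_gauss(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, var_q) || N(mu_p, var_p) ) for diagonal Gaussians,
    summed over dimensions."""
    return sum(
        0.5 * (math.log(vp / vq) + (vq + (mq - mp) ** 2) / vp - 1.0)
        for mq, vq, mp, vp in zip(mu_q, var_q, mu_p, var_p)
    )

def split_latent(z, n_shared):
    """Partition a latent vector into modality-shared and modality-specific parts."""
    return z[:n_shared], z[n_shared:]

# Hypothetical posterior for one modality: 2 shared dims + 2 specific dims.
mu = [0.5, -0.2, 1.0, 0.0]
var = [1.0, 0.8, 1.2, 1.0]
mu_shared, mu_specific = split_latent(mu, 2)
var_shared, var_specific = split_latent(var, 2)

# In MATE the priors are produced by learned shared/specific prior networks
# conditioned on history; a fixed standard-normal prior stands in here.
kl_shared = kl_diag_gauss(mu_shared, var_shared, [0.0, 0.0], [1.0, 1.0])
kl_specific = kl_diag_gauss(mu_specific, var_specific, [0.0, 0.0], [1.0, 1.0])
```

Separating the two KL terms is what lets the model weight the regularization of shared versus specific latents independently; replacing the fixed priors with networks that condition the specific prior on the shared latents is one way to express the dependence the paper argues for.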
Citation
R. Cai et al., “Learning Disentangled Representation for Multi-Modal Time-Series Sensing Signals,” Proceedings of the ACM on Web Conference 2025, pp. 3247–3266, Apr. 2025, doi: 10.1145/3696410.3714931
Source
WWW 2025 - Proceedings of the ACM Web Conference
Conference
34th ACM Web Conference, WWW 2025
Keywords
Multimodal Time Series, Time Series Representation
Publisher
Association for Computing Machinery
