Vision Language Models for Dynamic Human Activity Recognition in Healthcare Settings
Abid, Abderrazek ; Ho, Thanhcong ; Karray, Fakhry
Department
Machine Learning
Type
Conference proceeding
Date
2026
Language
English
Abstract
As generative AI continues to evolve, Vision Language Models (VLMs) have emerged as promising tools in various healthcare applications. One area that remains relatively underexplored is their use in human activity recognition (HAR) for remote health monitoring. VLMs offer notable strengths, including greater flexibility and the ability to overcome some of the constraints of traditional deep learning models. However, a key challenge in applying VLMs to HAR lies in the difficulty of evaluating their dynamic and often non-deterministic outputs. To address this gap, we introduce a descriptive caption dataset and propose comprehensive methods for evaluating VLMs in HAR. In comparative experiments with state-of-the-art deep learning models, VLMs achieve comparable performance and, in some cases, surpass conventional approaches in accuracy. This work contributes a strong benchmark and opens new possibilities for integrating VLMs into intelligent healthcare systems. Code and dataset are available at: https://github.com/gouga10/VLMs-HAR-RHMS.git.
Citation
A. Abid, T. C. Ho, and F. Karray, “Vision Language Models for Dynamic Human Activity Recognition in Healthcare Settings,” Lecture Notes in Computer Science, vol. 16051 LNCS, pp. 35–45, 2026, doi: 10.1007/978-3-032-08452-1_4
Source
Lecture Notes in Computer Science
Conference
12th International Work-Conference on Bioinformatics and Biomedical Engineering, IWBBIO 2025
Keywords
Generative AI, Human Activity Recognition, Remote Health Monitoring, Vision Language Models
Publisher
Springer Nature
