RespiroDynamics Unveiled: A Groundbreaking Multi-Modal Deep Learning and Spiking Neural Network Framework for Revolutionizing Non-Invasive Lung Health Assessment
Author: Sharshar, Ahmed
Department: Computer Vision
Embargo End Date: 2025-05-21
Type: Thesis
Date: 2024
Language: English
Abstract
This thesis investigates non-invasive lung health assessment using deep learning and Spiking Neural Networks (SNNs) to analyze thermal and RGB video data. Traditional respiratory diagnostics often require direct physical contact, which can cause patient discomfort. The research aims to develop a non-contact, robust, precise, and energy-efficient lung health model that uses thermal or mobile video recordings together with personal data, removing the need for traditional spirometry. A unique dataset was collected from 60 male participants of various demographics and health backgrounds, comprising thermal and RGB videos, heart rate, ECG data, and detailed metadata. The methodology involved building and testing several neural network models, including Convolutional Neural Networks (CNNs) for classification and regression tasks and SNNs that process temporal respiratory patterns. Contributions include data augmentation, the Adaptive Precision-Tuned Regression (APTR) loss function, multimodal data integration, attention mechanisms, and ensemble learning to enhance model performance. The models performed well on both classification and regression tasks. In the Forced Vital Capacity (FVC) Normal vs. Abnormal classification, the thermal model achieved a perfect score of 100% and the RGB model scored 99.7%. In the Peak Expiratory Flow (PEF) classification, the thermal model reached 97.14% accuracy, compared with 96% for RGB. SNN accuracy improved from 91.99% to 99.5% after data aggregation for thermal videos, and from 81.17% to 99% for RGB. In regression tasks, ensemble learning significantly boosted performance: the thermal model reported a Relative Root Mean Square Error of 0.11, a Relative Mean Absolute Error of 0.09, and a Pearson Correlation of 0.93. By comparison, the RGB model performed worse, with corresponding values of 0.26, 0.21, and 0.79.
These findings highlight the superior performance of thermal imaging over RGB in detecting respiratory patterns and the beneficial impact of integrating metadata into the models, setting new standards in the field.
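The regression metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not define how the "relative" errors are normalised, so the version below assumes division by the mean of the reference values; the thesis may use a different normalisation. The function names and the sample values are hypothetical and are not taken from the thesis.

```python
import math


def relative_rmse(y_true, y_pred):
    """RMSE divided by the mean reference value (assumed normalisation)."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return math.sqrt(mse) / (sum(y_true) / n)


def relative_mae(y_true, y_pred):
    """MAE divided by the mean reference value (assumed normalisation)."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return mae / (sum(y_true) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between references and predictions."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Made-up spirometry-style values in litres, for illustration only.
fvc_true = [3.0, 4.0, 5.0, 4.0]
fvc_pred = [3.5, 3.5, 5.0, 4.0]

print(f"relative RMSE: {relative_rmse(fvc_true, fvc_pred):.3f}")
print(f"relative MAE:  {relative_mae(fvc_true, fvc_pred):.3f}")
print(f"Pearson r:     {pearson_r(fvc_true, fvc_pred):.3f}")
```

Under this reading, lower relative errors and a Pearson correlation closer to 1 indicate better agreement with the spirometer reference, which is the direction of the thermal-versus-RGB comparison reported above.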
Citation
A. Sharshar, "RespiroDynamics Unveiled: A Groundbreaking Multi-Modal Deep Learning and Spiking Neural Network Framework for Revolutionizing Non-Invasive Lung Health Assessment", M.S. Thesis, Computer Vision, MBZUAI, Abu Dhabi, UAE, 2024
