Fusion of Deep Neural Networks, LLMs, and Traditional Machine Learning for Enhanced Clinical Prediction on Tabular and Multimodal Datasets
Cabrera Berobide, Alvaro Maria
Department
Machine Learning
Embargo End Date
2025-05-30
Type
Thesis
Date
2025
Language
English
Abstract
The convergence of Machine Learning (ML) and Large Language Models (LLMs) has opened new possibilities in disease risk prediction and personalized medicine. In this thesis, we explore how LLMs, pretrained on extensive textual data including medical literature, can be integrated with traditional ML methods to improve the prediction of medical conditions. Using the Human Phenotype Project (HPP) dataset, which spans over 28,000 individuals and includes multi-omics, physiological, and imaging data, we propose a three-stage approach. First, we develop methods to convert diverse patient data, ranging from tabular records to time-series signals and imaging, into structured text and embedding representations. This transformation lets LLMs apply their contextual reasoning capabilities and pretrained domain knowledge to a varied set of tasks; here we focus on diabetes classification, age estimation, and systolic blood pressure prediction. In the second stage, we build a multimodal LLM architecture to exploit the multimodal data further. The architecture has four distinct paths: one that processes only text, one that fuses text with multimodal embeddings, one that uses only the multimodal embeddings, and a joint-encoding pathway that combines a pooled text representation with raw numeric features. For the multimodal integration, a neural network module refines the multimodal embeddings before they are input to the LLM, so that heterogeneous data sources are integrated more effectively. Lastly, in the third stage, we extract the multimodal LLM’s output embeddings, which capture high-level representations of a patient’s data. These embeddings are then processed and fed into conventional ML models such as XGBoost.
This integration allows us to investigate whether LLM-derived features can offer additional insights and boost the performance of established ML techniques. Preliminary experiments confirm that this embedding-based integration can boost performance in predicting chronic conditions such as diabetes and cardiovascular disease. Moreover, our evaluation indicates that LLM prompting models tend to perform more reliably on classification tasks, such as identifying diabetic patients, than on regression tasks like age estimation or blood pressure prediction, which pose additional challenges in generating precise numerical outputs. The work systematically compares traditional methods (e.g., XGBoost), state-of-the-art deep learning architectures (e.g., TabPFN), and various LLM-based strategies, including both direct prompting and multimodal fusion approaches that employ chain-of-thought reasoning. Experiments conducted under both data-constrained and data-rich scenarios demonstrate that while classical ML models excel in scalability and interpretability, LLM-based methods, especially when guided by chain-of-thought prompts, can leverage their pretrained contextual understanding to achieve competitive performance in low-data regimes. In addition, the fusion of multimodal embeddings with structured text provides complementary signals that further improve balanced predictive metrics, laying the groundwork for more accurate and scalable disease prediction in real-world healthcare settings.
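As a rough illustration of the first stage described in the abstract (turning a tabular patient record into structured text that an LLM can read), a minimal Python sketch follows. It is not the thesis's actual serialization method: the function name `record_to_text`, the field names, the units, and the sentence template are all hypothetical choices made for this example.

```python
from typing import Mapping, Optional


def record_to_text(record: Mapping[str, Optional[float]],
                   units: Mapping[str, str]) -> str:
    """Serialize one tabular record into a short plain-text description.

    Hypothetical helper for illustration only; the template is an assumption.
    """
    parts = []
    for name, value in record.items():
        if value is None:
            # Skip missing measurements instead of inventing a value.
            continue
        unit = units.get(name, "")
        label = name.replace("_", " ")
        # ":g" drops trailing zeros, so 54.0 is written as "54".
        parts.append(f"{label}: {value:g} {unit}".strip())
    return "Patient record. " + "; ".join(parts) + "."


# Illustrative values only, not drawn from any dataset.
example = {"age": 54, "bmi": 27.3, "fasting_glucose": None, "hba1c": 5.9}
units = {"age": "years", "bmi": "kg/m^2", "hba1c": "%"}
text = record_to_text(example, units)
print(text)
```

The resulting string could then be placed in a prompt or passed to a text encoder; how missing values, units, and feature ordering are best handled is a design decision this sketch does not settle.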
Citation
Alvaro Maria Cabrera Berobide, “Fusion of Deep Neural Networks, LLMs, and Traditional Machine Learning for Enhanced Clinical Prediction on Tabular and Multimodal Datasets,” Master of Science thesis, Machine Learning, MBZUAI, 2025.
Keywords
Multimodal LLMs, Feature Embeddings, Foundation Models, Multimodal Machine Learning, Tabular Data Prediction, Medical AI
