SmartLLM: Multi-Dimensional Dataset Generation via LLM Simulation in Smart Home
Zheng, Huanke ; Wang, Rui ; Yang, Qin ; AlQahtan, Salman A ; Chen, Min ; Guizani, Mohsen ; Fortino, Giancarlo
Department
Machine Learning
Type
Journal article
Language
English
Abstract
Human activity prediction is crucial for enabling intelligent smart home services, yet it is often hindered by the scarcity of high-quality, multi-dimensional datasets. Existing datasets are typically fragmented, capturing either long-term activity sequences or short-term device interactions, but rarely both in a unified manner. Traditional data collection methods are costly and time-consuming, while conventional simulation techniques struggle to generate diverse and logically coherent behavior sequences. To address these limitations, we propose SmartLLM, a novel Large Language Model (LLM)-based simulation framework for the automated generation of multi-dimensional smart home datasets. SmartLLM simulates agents with distinct profiles (e.g., old man, remote worker, holiday maker) performing daily activities within configurable home environments, generating temporally aligned sequences across the Activity-Device-Sensor dimensions. We generated two months of simulated data for three user profiles and validated its plausibility through activity distribution visualization, statistical perplexity analysis, and case studies. Multi-dimensional feature validation experiments further demonstrate that our multi-dimensional data significantly enhances the accuracy of activity prediction models compared with single-dimensional features. This work addresses key bottlenecks in smart home data acquisition and provides a scalable, high-quality data foundation for advancing smart home algorithm research. The code is available at https://github.com/HuankeZheng/SmartLLM.
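As a rough illustration of what "temporally aligned sequences across Activity-Device-Sensor dimensions" could look like in practice, the following minimal Python sketch groups device and sensor events under the activity whose time interval contains them. The `Event` schema, the `align` helper, and the sample labels are hypothetical and chosen only for illustration; they are not taken from the SmartLLM paper or its repository.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """One timestamped record in a single dimension (hypothetical schema)."""
    start: datetime
    end: datetime
    dimension: str  # "activity", "device", or "sensor"
    label: str


def align(events):
    """Attach each device/sensor event to the activity whose interval contains its start time."""
    activities = sorted(
        (e for e in events if e.dimension == "activity"), key=lambda e: e.start
    )
    aligned = {a: {"device": [], "sensor": []} for a in activities}
    for e in events:
        if e.dimension == "activity":
            continue
        for a in activities:
            if a.start <= e.start < a.end:
                aligned[a][e.dimension].append(e.label)
                break  # assume activities do not overlap
    return aligned


T0 = datetime(2025, 1, 1, 7, 0)  # arbitrary illustrative start time


def at(minute):
    """Return the timestamp `minute` minutes after T0."""
    return T0 + timedelta(minutes=minute)


# Made-up sample records, not data from the paper.
events = [
    Event(at(0), at(30), "activity", "cook breakfast"),
    Event(at(2), at(25), "device", "stove on"),
    Event(at(3), at(25), "sensor", "kitchen temperature rising"),
    Event(at(30), at(60), "activity", "eat breakfast"),
    Event(at(31), at(32), "sensor", "dining room motion"),
]

for activity, linked in align(events).items():
    print(activity.label, linked)
```

Keeping the three dimensions as separate timestamped records and joining them by interval is only one possible design; a dataset could equally store one row per time step with a column for each dimension.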
Citation
H. Zheng, R. Wang, Q. Yang, S.A. AlQahtan, M. Chen, M. Guizani, et al., "SmartLLM: Multi-Dimensional Dataset Generation via LLM Simulation in Smart Home," IEEE Internet of Things Journal, vol. PP, no. 99, pp. 1-1, 2026, https://doi.org/10.1109/jiot.2026.3663651.
Source
IEEE Internet of Things Journal
Keywords
46 Information and Computing Sciences, 4605 Data Management and Data Science
Publisher
IEEE
