Hier-SLAM: Scaling-Up Semantics in SLAM with a Hierarchically Categorical Gaussian Splatting

Li, Boying
Cai, Zhixi
Li, Yuan-Fang
Reid, Ian
Rezatofighi, Hamid
Department
Computer Vision
Type
Conference proceeding
Date
2025
Language
English
Abstract
We propose Hier-SLAM, a semantic 3D Gaussian Splatting SLAM method featuring a novel hierarchical categorical representation, which enables accurate global 3D semantic mapping, scaling-up capability, and explicit semantic label prediction in the 3D world. In semantic SLAM systems, parameter usage grows significantly as the environment becomes more complex, making scene understanding particularly challenging and costly. To address this problem, we introduce a novel hierarchical representation that encodes semantic information into 3D Gaussian Splatting in a compact form, leveraging the capabilities of large language models (LLMs). We further introduce a novel semantic loss designed to optimize hierarchical semantic information through both inter-level and cross-level optimization. Furthermore, we enhance the whole SLAM system, resulting in improved tracking and mapping performance. Hier-SLAM outperforms existing dense SLAM methods in both mapping and tracking accuracy while achieving a 2x operation speed-up. It also achieves semantic rendering performance on par with existing methods while significantly reducing storage and training time requirements. Rendering speed reaches 2,000 FPS with semantic information and 3,000 FPS without it. Most notably, it handles complex real-world scenes with more than 500 semantic classes, highlighting its scaling-up capability. The open-source code is available at https://github.com/LeeBY68/Hier-SLAM.
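The compactness claim in the abstract can be illustrated with a small sketch. It is not taken from the Hier-SLAM code base: the function names, the uniform three-level tree with 8 children per node, and the choice of one one-hot block per level are all assumptions made for illustration. The idea it shows is that a class identified by its root-to-leaf path in a tree needs one small block of channels per level, so the per-Gaussian channel count grows with the sum of the level sizes rather than with the total number of classes.

```python
# Illustrative sketch only; names and the tree shape are hypothetical,
# not the Hier-SLAM implementation.

def flat_dims(num_classes):
    """Channels needed when every class gets its own one-hot slot."""
    return num_classes

def hierarchical_dims(level_sizes):
    """Channels needed when each tree level gets its own one-hot block."""
    return sum(level_sizes)

def capacity(level_sizes):
    """Number of distinct leaf classes the tree can represent."""
    total = 1
    for size in level_sizes:
        total *= size
    return total

def encode_path(path, level_sizes):
    """Concatenate one one-hot block per level for a root-to-leaf path."""
    code = []
    for index, size in zip(path, level_sizes):
        if not 0 <= index < size:
            raise ValueError("index out of range for its level")
        block = [0] * size
        block[index] = 1
        code.extend(block)
    return code

def decode_path(code, level_sizes):
    """Recover the path by taking the argmax inside each level's block."""
    path, offset = [], 0
    for size in level_sizes:
        block = code[offset:offset + size]
        path.append(max(range(size), key=block.__getitem__))
        offset += size
    return path

# Assumed tree: 3 levels, 8 choices per level (enough leaves for 500 classes).
LEVELS = [8, 8, 8]
```

Under these assumed level sizes, `capacity(LEVELS)` covers at least 500 classes while `hierarchical_dims(LEVELS)` stays far below `flat_dims(500)`. The actual tree, its depth, and the encoding used by the authors may differ; the linked repository is the reference for those details.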
Citation
B. Li, Z. Cai, Y.-F. Li, I. Reid and H. Rezatofighi, "Hier-SLAM: Scaling-Up Semantics in SLAM with a Hierarchically Categorical Gaussian Splatting," 2025 IEEE International Conference on Robotics and Automation (ICRA), Atlanta, GA, USA, 2025, pp. 9748-9754, doi: 10.1109/ICRA55743.2025.11127775.
Source
International Conference on Robotics and Automation (ICRA)
Conference
2025 IEEE International Conference on Robotics and Automation (ICRA)
Keywords
Training, Simultaneous Localization and Mapping, Three-Dimensional Displays, Accuracy, Scalability, Large Language Models, Semantics, Rendering (Computer Graphics), Robotics and Automation, Optimization
Publisher
IEEE