
LEMMo-Plan: LLM-Enhanced Learning from Multi-Modal Demonstration for Planning Sequential Contact-Rich Manipulation Tasks

Chen, Kejia
Shen, Zheng
Zhang, Yue
Chen, Lingyun
Wu, Fan
Bing, Zhenshan
Haddadin, Sami
Knoll, Alois
Department
Robotics
Type
Conference proceeding
Date
2025
Language
English
Abstract
Large Language Models (LLMs) have gained popularity in task planning for long-horizon manipulation tasks. To enhance the validity of LLM-generated plans, visual demonstrations and online videos have been widely employed to guide the planning process. However, for manipulation tasks involving subtle movements but rich contact interactions, visual perception alone may be insufficient for the LLM to fully interpret the demonstration. Additionally, visual data provides limited information on force-related parameters and conditions, which are crucial for effective execution on real robots. In this paper, we introduce LEMMo-Plan, an in-context learning framework that incorporates tactile and force-torque information from human demonstrations to enhance LLMs' ability to generate plans for new task scenarios. We propose a bootstrapped reasoning pipeline that sequentially integrates each modality into a comprehensive task plan. This task plan is then used as a reference for planning in new task configurations. Real-world experiments on two different sequential manipulation tasks demonstrate the effectiveness of our framework in improving LLMs' understanding of multi-modal demonstrations and enhancing the overall planning performance. More materials are available on our project website: lemmo-plan.github.io/LEMMo-Plan/.
Citation
K. Chen et al., "LEMMo-Plan: LLM-Enhanced Learning from Multi-Modal Demonstration for Planning Sequential Contact-Rich Manipulation Tasks," 2025 IEEE International Conference on Robotics and Automation (ICRA), Atlanta, GA, USA, 2025, pp. 11972-11978, doi: 10.1109/ICRA55743.2025.11127842.
Source
International Conference on Robotics and Automation (ICRA)
Conference
2025 IEEE International Conference on Robotics and Automation (ICRA)
Keywords
Visualization, Large Language Models, Pipelines, Transforms, Reliability Engineering, Cognition, Planning, Robots, Visual Perception, Videos
Publisher
IEEE