Multi-Agent Large Language Models for Zero-Shot Robotic Manipulation
Singh, Harsh
Author
Supervisor
Department
Computer Vision
Embargo End Date
2025-05-30
Type
Thesis
Date
2025
License
Language
English
Collections
Abstract
Large Language Models (LLMs) have demonstrated remarkable planning abilities across various domains, including robotic manipulation and navigation. While recent work in robotics deploys LLMs for high-level and low-level planning, existing methods often face challenges with failure recovery and suffer from hallucinations. To address these limitations, we propose a novel multi-agent LLM framework, Multi-Agent Large Language Model for Manipulation (MALMM). Notably, MALMM distributes planning across three specialized LLM agents: a high-level planning agent, a low-level control agent, and a supervisor agent. Unlike existing methods, MALMM does not rely on pre-trained skill policies or in-context learning examples and generalizes to unseen tasks. In our experiments, MALMM demonstrates excellent performance on previously unseen tasks and outperforms existing zero-shot LLM-based methods on RLBench. Experiments with a Franka arm validate our approach in the real world.
Citation
Harsh Singh, “Multi-Agent Large Language Models for Zero-Shot Robotic Manipulation,” Master of Science thesis, Computer Vision, MBZUAI, 2025.
Source
Conference
Keywords
Robotic Manipulation, Embodied AI, LLM Agents, Robotic Agents
