Finite-Sample Convergence Bounds for Trust Region Policy Optimization in Mean-Field Games

Ocello, Antonio
Tiapkin, Daniil
Mancini, Lorenzo
Laurière, Mathieu
Moulines, Éric
Department
Machine Learning
Type
Conference proceeding
Date
2025
Language
English
Abstract
We introduce Mean-Field Trust Region Policy Optimization (MF-TRPO), a novel algorithm designed to compute approximate Nash equilibria for ergodic Mean-Field Games (MFG) in finite state-action spaces. Building on the well-established performance of TRPO in the reinforcement learning (RL) setting, we extend its methodology to the MFG framework, leveraging its stability and robustness in policy optimization. Under standard assumptions in the MFG literature, we provide a rigorous analysis of MF-TRPO, establishing theoretical guarantees on its convergence. Our results cover both the exact formulation of the algorithm and its sample-based counterpart, for which we derive high-probability guarantees and finite-sample complexity bounds. This work advances MFG optimization by bridging RL techniques with mean-field decision-making, offering a theoretically grounded approach to solving complex multi-agent problems.
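The abstract describes an alternating scheme: a TRPO-style (KL-regularized) policy improvement step coupled with a mean-field update, iterated toward a Nash equilibrium. The sketch below is illustrative only, not the paper's MF-TRPO: it uses a toy finite MFG with randomly generated dynamics, discounted policy evaluation as a stand-in for the ergodic criterion, and a multiplicative-weights (softmax mirror-descent) update as the KL-regularized policy step. All constants and the crowd-averse reward are assumptions made for the example.

```python
import numpy as np

# Toy finite mean-field game: nS states, nA actions.
# Illustrative sketch only -- not the paper's MF-TRPO algorithm.
rng = np.random.default_rng(0)
nS, nA, gamma, eta = 3, 2, 0.95, 0.5

# Random transition kernel P[s, a] = distribution over next states.
P = rng.random((nS, nA, nS))
P /= P.sum(axis=2, keepdims=True)
base_r = rng.random((nS, nA))

def reward(mu):
    # Crowd-averse reward: occupying a crowded state is penalized.
    return base_r - 1.0 * mu[:, None]

def q_eval(pi, mu, iters=500):
    # Discounted policy evaluation (proxy for the ergodic criterion).
    r = reward(mu)
    Q = np.zeros((nS, nA))
    for _ in range(iters):
        V = (pi * Q).sum(axis=1)
        Q = r + gamma * (P @ V)
    return Q

def stationary_dist(pi):
    # Stationary distribution of the Markov chain induced by pi.
    M = np.einsum('sap,sa->sp', P, pi)
    vals, vecs = np.linalg.eig(M.T)
    v = np.abs(np.real(vecs[:, np.argmax(np.real(vals))]))
    return v / v.sum()

# Fixed-point iteration: KL-regularized policy step + mean-field step.
pi = np.full((nS, nA), 1.0 / nA)
mu = np.full(nS, 1.0 / nS)
for _ in range(200):
    Q = q_eval(pi, mu)
    # TRPO-flavored trust-region step: multiplicative-weights update,
    # i.e. the closed form of a KL-penalized policy improvement.
    pi = pi * np.exp(eta * Q)
    pi /= pi.sum(axis=1, keepdims=True)
    # Damped mean-field update toward the induced stationary law.
    mu = 0.9 * mu + 0.1 * stationary_dist(pi)

print(np.round(mu, 3))
```

The damping on the mean-field update mirrors the stability role that the trust region plays on the policy side: both keep successive iterates close so the coupled fixed-point iteration does not oscillate.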
Citation
A. Ocello, D. Tiapkin, L. Mancini, M. Laurière, and E. Moulines, "Finite-Sample Convergence Bounds for Trust Region Policy Optimization in Mean-Field Games," Oct. 06, 2025, PMLR. [Online]. Available: https://proceedings.mlr.press/v267/ocello25a.html
Source
Proceedings of Machine Learning Research
Conference
42nd International Conference on Machine Learning, ICML 2025
Publisher
ML Research Press