Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

Authors
Ye, Tian
Xu, Zicheng
Li, Yuanzhi
Allen-Zhu, Zeyuan
Department
Machine Learning
Type
Conference proceeding
Date
2025
Language
English
Abstract
Recent advances in language models have demonstrated their capability to solve mathematical reasoning problems, achieving near-perfect accuracy on grade-school-level math benchmarks like GSM8K. In this paper, we formally study how language models solve these problems. We design a series of controlled experiments to address several fundamental questions: (1) Can language models truly develop reasoning skills, or do they simply memorize templates? (2) What is the model's hidden (mental) reasoning process? (3) Do models solve math questions using skills similar to or different from humans? (4) Do models trained on GSM8K-like datasets develop reasoning skills beyond those necessary for solving GSM8K problems? (5) What mental process causes models to make reasoning mistakes? (6) How large or deep must a model be to effectively solve GSM8K-level math questions? Our study uncovers many hidden mechanisms by which language models solve mathematical questions, providing insights that extend beyond the current understanding of LLMs. © 2025 13th International Conference on Learning Representations, ICLR 2025. All rights reserved.
Citation
T. Ye, Z. Xu, Y. Li, and Z. Allen-Zhu, “Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process,” International Conference on Learning Representations, vol. 2025, pp. 97699–97709, May 2025.
Source
13th International Conference on Learning Representations, ICLR 2025
Conference
13th International Conference on Learning Representations, ICLR 2025
Publisher
International Conference on Learning Representations, ICLR