
Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs

Puerto, Haritz
Chubakov, Tilek
Zhu, Xiaodan
Tayyar Madabushi, Harish
Gurevych, Iryna
Department
Natural Language Processing
Type
Conference proceeding
Date
2025
Language
English
Abstract
Requiring a large language model (LLM) to generate intermediate reasoning steps, known as a Chain of Thought (CoT), has been shown to be an effective way of boosting performance. Previous approaches have focused on generating multiple independent CoTs and combining them through ensembling or other post-hoc strategies to enhance reasoning. In this work, we introduce a novel approach in which LLMs are fine-tuned to generate a sequence of Diverse Chains of Thought (DCoT) within a single inference step, which is fundamentally different from prior work that primarily operates on parallel CoT generations. DCoT enables LLMs to perform within-inference refinement of reasoning chains without requiring external feedback. Through a rigorous set of experiments spanning a wide range of tasks that require various reasoning types, we show that fine-tuning on DCoT improves performance over the CoT baseline across model families and scales (1.3B to 70B). These improvements are particularly impactful for tasks with a large result state space, such as those involving numeric answers. Our work is also significant because both quantitative analyses and manual evaluations reveal that the observed gains stem from the models’ ability to refine an initial reasoning chain by generating a second, improved chain within the same inference step, demonstrating previously elusive self-improvement. Our code and data are publicly available.
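
The abstract describes fine-tuning targets that pack several diverse reasoning chains for the same question into one output sequence. As a minimal sketch of that idea (the helper name, marker strings, and prompt wording below are illustrative assumptions, not the authors' published data format, which is available in their repository), a DCoT-style training example could be assembled like this:

# Hypothetical sketch of DCoT-style fine-tuning data construction.
# Marker strings and the function name are illustrative assumptions.

def build_dcot_example(question: str, chains: list[str], answer: str) -> dict:
    """Pack k diverse reasoning chains for one question into a single
    target sequence, so the model learns to emit several chains, and
    implicitly refine earlier ones, within one decoding pass."""
    target_parts = [f"[CoT {i}] {chain}" for i, chain in enumerate(chains, start=1)]
    target_parts.append(f"[Answer] {answer}")
    return {
        "input": f"Q: {question}\nGenerate {len(chains)} reasoning chains.",
        "target": "\n".join(target_parts),
    }

example = build_dcot_example(
    question="If a train travels 120 km in 2 hours, what is its speed?",
    chains=[
        "Speed = distance / time = 120 / 2 = 60 km/h.",
        "In one hour the train covers half of 120 km, i.e. 60 km, so 60 km/h.",
    ],
    answer="60 km/h",
)
print(example["target"])

At inference time, a model fine-tuned on such targets emits a first chain and then a second, refined chain before the final answer, all within a single inference step, which is the within-inference refinement the paper reports.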
Citation
H. Puerto, T. Chubakov, X. Zhu, H. T. Madabushi, and I. Gurevych, “Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs,” 2025. [Online]. Available: https://aclanthology.org/2025.acl-long.191/
Source
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics
Conference
63rd Annual Meeting of the Association for Computational Linguistics, 2025
Publisher
Association for Computational Linguistics