DEBATE, TRAIN, EVOLVE: Self Evolution of Language Model Reasoning

Srivastava, Gaurav; Bi, Zhenyu; Lu, Meng; Wang, Xuan

arXiv:2505.15734 (cs)

[Submitted on 21 May 2025]

Abstract: Large language models (LLMs) have improved significantly in their reasoning through extensive training on massive datasets. However, relying solely on additional data for improvement is becoming increasingly impractical, highlighting the need for models to autonomously enhance their reasoning without external supervision. In this paper, we propose Debate, Train, Evolve (DTE), a novel ground-truth-free training framework that uses multi-agent debate traces to evolve a single language model. We also introduce a new prompting strategy, Reflect-Critique-Refine, which improves debate quality by explicitly instructing agents to critique and refine their reasoning. Extensive evaluations on five reasoning benchmarks with six open-weight models show that our DTE framework achieves substantial improvements, with an average accuracy gain of 8.92% on the challenging GSM-PLUS dataset. Furthermore, we observe strong cross-domain generalization, with an average accuracy gain of 5.8% on all other benchmarks, suggesting that our method captures general reasoning capabilities.
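The abstract does not say how debate traces are turned into training data, so the following is only a minimal sketch of one plausible reading: several agents answer a question, revise after seeing their peers' answers, and the majority answer of the final round serves as a pseudo-label in place of ground truth. The `Agent` type, `debate`, `stubborn`, and `follower` are hypothetical names introduced for illustration and are not taken from the paper; real agents would be LLM calls rather than stubs.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical agent interface (an assumption, not the paper's API):
# takes the question and the peers' previous answers, returns an answer.
Agent = Callable[[str, List[str]], str]


def debate(question: str, agents: List[Agent], rounds: int = 2) -> dict:
    """Run a toy debate; return the trace and a majority-vote consensus."""
    # Round 0: every agent answers independently (no peer answers yet).
    answers = [agent(question, []) for agent in agents]
    trace = [list(answers)]
    for _ in range(rounds):
        # Each agent sees the other agents' answers from the previous round.
        answers = [
            agent(question, answers[:i] + answers[i + 1:])
            for i, agent in enumerate(agents)
        ]
        trace.append(list(answers))
    # The most common final answer acts as a pseudo-label (no ground truth used).
    consensus, votes = Counter(answers).most_common(1)[0]
    return {"question": question, "trace": trace,
            "consensus": consensus, "votes": votes}


def stubborn(answer: str) -> Agent:
    """Stub agent that always returns the same answer."""
    def agent(question: str, peers: List[str]) -> str:
        return answer
    return agent


def follower(initial: str) -> Agent:
    """Stub agent that adopts the most common peer answer once it sees any."""
    def agent(question: str, peers: List[str]) -> str:
        if not peers:
            return initial
        return Counter(peers).most_common(1)[0][0]
    return agent


if __name__ == "__main__":
    result = debate("What is 7 + 5?",
                    [stubborn("12"), stubborn("12"), follower("15")],
                    rounds=1)
    print(result["consensus"], result["votes"])
```

In this sketch the returned `trace` and `consensus` together form one candidate training example. How such examples are filtered and used to update a single model is left open here, since the abstract gives no details.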

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2505.15734 [cs.CL]
	(or arXiv:2505.15734v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2505.15734

Submission history

From: Gaurav Srivastava
[v1] Wed, 21 May 2025 16:40:12 UTC (1,383 KB)
