Search-Based Correction of Reasoning Chains for Language Models

Kim, Minsu; Falet, Jean-Pierre; Richardson, Oliver E.; Chen, Xiaoyin; Jain, Moksh; Ahn, Sungjin; Ahn, Sungsoo; Bengio, Yoshua

arXiv:2505.11824 (cs)

[Submitted on 17 May 2025]

Abstract: Chain-of-Thought (CoT) reasoning has advanced the capabilities and transparency of language models (LMs); however, reasoning chains can contain inaccurate statements that reduce performance and trustworthiness. To address this, we introduce a new self-correction framework that augments each reasoning step in a CoT with a latent variable indicating its veracity, enabling modeling of all possible truth assignments rather than assuming correctness throughout. To efficiently explore this expanded space, we introduce Search Corrector, a discrete search algorithm over boolean-valued veracity assignments. It efficiently performs otherwise intractable inference in the posterior distribution over veracity assignments by leveraging the LM's joint likelihood over veracity and the final answer as a proxy reward. This efficient inference-time correction method facilitates supervised fine-tuning of an Amortized Corrector by providing pseudo-labels for veracity. The Amortized Corrector generalizes self-correction, enabling accurate zero-shot veracity inference in novel contexts. Empirical results demonstrate that Search Corrector reliably identifies errors in logical (ProntoQA) and mathematical reasoning (GSM8K) benchmarks. The Amortized Corrector achieves comparable zero-shot accuracy and improves final answer accuracy by up to 25%.
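
To make the idea of searching over boolean-valued veracity assignments concrete, the following is a minimal sketch and not the authors' Search Corrector. It assumes a greedy single-flip local search, which may differ from the search procedure in the paper, and it replaces the LM's joint likelihood with a hypothetical `toy_reward` built from made-up per-step log-probabilities. The names `greedy_flip_search` and `toy_reward` are illustrative only.

```python
from typing import Callable, Tuple

Assignment = Tuple[bool, ...]


def greedy_flip_search(
    n_steps: int,
    reward: Callable[[Assignment], float],
    max_iters: int = 100,
) -> Tuple[Assignment, float]:
    """Greedy local search over veracity assignments (illustrative assumption).

    Starts from "every step is true" and flips one veracity bit at a time,
    keeping a flip only when it strictly increases the proxy reward.
    """
    current: Assignment = tuple([True] * n_steps)
    best = reward(current)
    for _ in range(max_iters):
        improved = False
        for i in range(n_steps):
            # Candidate differs from the current assignment in position i only.
            candidate = current[:i] + (not current[i],) + current[i + 1:]
            score = reward(candidate)
            if score > best:
                current, best = candidate, score
                improved = True
        if not improved:
            break  # No single flip helps: local optimum reached.
    return current, best


def toy_reward(assignment: Assignment) -> float:
    """Hypothetical stand-in for the LM's joint log-likelihood.

    The numbers below are invented for illustration; a real proxy reward
    would come from scoring veracity labels and the final answer with an LM.
    """
    logp_if_true = [-0.1, -2.3, -0.2]
    logp_if_false = [-2.3, -0.1, -1.6]
    return sum(
        t if v else f
        for v, t, f in zip(assignment, logp_if_true, logp_if_false)
    )


result, score = greedy_flip_search(3, toy_reward)
print(result, score)
```

With these invented scores, the search marks the second step as false and leaves the other two as true. A space of n steps has 2^n assignments, which is why exhaustive enumeration becomes intractable and some form of guided search is needed.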

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2505.11824 [cs.LG]
	(or arXiv:2505.11824v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2505.11824

Submission history

From: Jean-Pierre Falet
[v1] Sat, 17 May 2025 04:16:36 UTC (2,040 KB)
