LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers

Olausson, Theo X.; Gu, Alex; Lipkin, Benjamin; Zhang, Cedegao E.; Solar-Lezama, Armando; Tenenbaum, Joshua B.; Levy, Roger

doi:10.18653/v1/2023.emnlp-main.313


arXiv:2310.15164 (cs)

[Submitted on 23 Oct 2023 (v1), last revised 14 Feb 2024 (this version, v2)]


Abstract: Logical reasoning, i.e., deductively inferring the truth value of a conclusion from a set of premises, is an important task for artificial intelligence with wide potential impacts on science, mathematics, and society. While many prompting-based strategies have been proposed to enable Large Language Models (LLMs) to do such reasoning more effectively, they still appear unsatisfactory, often failing in subtle and unpredictable ways. In this work, we instead investigate the validity of reformulating such tasks as modular neurosymbolic programming, which we call LINC: Logical Inference via Neurosymbolic Computation. In LINC, the LLM acts as a semantic parser, translating premises and conclusions from natural language to expressions in first-order logic. These expressions are then offloaded to an external theorem prover, which symbolically performs deductive inference. Leveraging this approach, we observe significant performance gains on FOLIO and a balanced subset of ProofWriter for three different models in nearly all experimental conditions we evaluate. On ProofWriter, augmenting the comparatively small open-source StarCoder+ (15.5B parameters) with LINC even outperforms GPT-3.5 and GPT-4 with Chain-of-Thought (CoT) prompting by an absolute 38% and 10%, respectively. When used with GPT-4, LINC scores 26% higher than CoT on ProofWriter while performing comparably on FOLIO. Further analysis reveals that although both methods on average succeed roughly equally often on this dataset, they exhibit distinct and complementary failure modes. We thus provide promising evidence for how logical reasoning over natural language can be tackled through jointly leveraging LLMs alongside symbolic provers. All corresponding code is publicly available.
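
The two-stage structure described in the abstract (a parser that maps natural language to logic, followed by a symbolic prover that labels the conclusion) can be illustrated with a small toy sketch. This is not the authors' implementation: the lookup-table `toy_parser` is a hypothetical stand-in for the LLM, the formulas are hand-grounded literals rather than full first-order logic, and the naive forward-chaining loop is a stand-in for an external theorem prover. All function and predicate names are assumptions made for illustration.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

# A rule is (body literals, head literal); a fact is a rule with an empty body.
Rule = Tuple[FrozenSet[str], str]

# Hypothetical stand-in for the LLM semantic parser: a fixed lookup table.
# Quantified sentences are hand-grounded to the single constant "rex".
TOY_TRANSLATIONS: Dict[str, Rule] = {
    "Rex is a dog.": (frozenset(), "Dog(rex)"),
    "All dogs are mammals.": (frozenset({"Dog(rex)"}), "Mammal(rex)"),
    "No mammal is a fish.": (frozenset({"Mammal(rex)"}), "~Fish(rex)"),
    "Rex is a mammal.": (frozenset(), "Mammal(rex)"),
    "Rex is a fish.": (frozenset(), "Fish(rex)"),
    "Rex is friendly.": (frozenset(), "Friendly(rex)"),
}


def toy_parser(sentence: str) -> Rule:
    """Translate one sentence into a rule (assumption: table lookup only)."""
    return TOY_TRANSLATIONS[sentence]


def negate(literal: str) -> str:
    """Toggle the '~' negation prefix on a literal."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def forward_chain(rules: List[Rule]) -> Set[str]:
    """Derive every literal reachable from the rules (toy prover stand-in)."""
    derived: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= derived and head not in derived:
                derived.add(head)
                changed = True
    return derived


def label_conclusion(
    premises: List[str],
    conclusion: str,
    parse: Callable[[str], Rule] = toy_parser,
) -> str:
    """Parse the premises and conclusion, then label the conclusion."""
    rules = [parse(p) for p in premises]
    _, goal = parse(conclusion)
    derived = forward_chain(rules)
    if goal in derived:
        return "True"
    if negate(goal) in derived:
        return "False"
    return "Uncertain"


PREMISES = ["Rex is a dog.", "All dogs are mammals.", "No mammal is a fish."]

if __name__ == "__main__":
    for question in ["Rex is a mammal.", "Rex is a fish.", "Rex is friendly."]:
        print(question, "->", label_conclusion(PREMISES, question))
```

The point of the sketch is the division of labor: the parser only translates, and the label comes entirely from symbolic derivation, so a wrong answer can be traced to either a mistranslation or a missing inference.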

Comments:	EMNLP Main 2023 (Outstanding Paper Award)
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2310.15164 [cs.CL]
	(or arXiv:2310.15164v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.15164
Journal reference:	Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 5153-5176, Singapore. Association for Computational Linguistics
Related DOI:	https://doi.org/10.18653/v1/2023.emnlp-main.313

Submission history

From: Benjamin Lipkin
[v1] Mon, 23 Oct 2023 17:58:40 UTC (344 KB)
[v2] Wed, 14 Feb 2024 18:56:03 UTC (344 KB)
