LeanDojo: Theorem Proving with Retrieval-Augmented Language Models

Yang, Kaiyu; Swope, Aidan M.; Gu, Alex; Chalamala, Rahul; Song, Peiyang; Yu, Shixing; Godil, Saad; Prenger, Ryan; Anandkumar, Anima

arXiv:2306.15626 (cs)

[Submitted on 27 Jun 2023 (v1), last revised 27 Oct 2023 (this version, v2)]

Abstract: Large language models (LLMs) have shown promise in proving formal theorems using proof assistants such as Lean. However, existing methods are difficult to reproduce or build on, due to private code, data, and large compute requirements. This has created substantial barriers to research on machine learning methods for theorem proving. This paper removes these barriers by introducing LeanDojo: an open-source Lean playground consisting of toolkits, data, models, and benchmarks. LeanDojo extracts data from Lean and enables interaction with the proof environment programmatically. It contains fine-grained annotations of premises in proofs, providing valuable data for premise selection: a key bottleneck in theorem proving. Using this data, we develop ReProver (Retrieval-Augmented Prover): an LLM-based prover augmented with retrieval for selecting premises from a vast math library. It is inexpensive and needs only one GPU week of training. Our retriever leverages LeanDojo's program analysis capability to identify accessible premises and hard negative examples, which makes retrieval much more effective. Furthermore, we construct a new benchmark consisting of 98,734 theorems and proofs extracted from Lean's math library. It features a challenging data split requiring the prover to generalize to theorems relying on novel premises that are never used in training. We use this benchmark for training and evaluation, and experimental results demonstrate the effectiveness of ReProver over non-retrieval baselines and GPT-4. We thus provide the first set of open-source LLM-based theorem provers without any proprietary datasets and release it under a permissive MIT license to facilitate further research.
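
The abstract's idea of retrieving only from accessible premises can be illustrated with a toy sketch. This is not ReProver's code or LeanDojo's API: the function names, the two-dimensional vectors, and the premise names below are all hypothetical, and plain cosine similarity stands in for whatever learned encoder the paper actually uses. The sketch only shows the filtering step, where premises that are not accessible at the current theorem are excluded before ranking.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_premises(state_vec, premise_vecs, accessible, k=2):
    """Rank only the accessible premises by similarity to the proof state.

    state_vec    -- embedding of the current proof state (hypothetical)
    premise_vecs -- dict mapping premise name to its embedding (hypothetical)
    accessible   -- set of premise names usable at this point in the library
    """
    scored = [
        (cosine(state_vec, vec), name)
        for name, vec in premise_vecs.items()
        if name in accessible  # inaccessible premises are never candidates
    ]
    # Highest similarity first; ties broken by name for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Toy data: "later_lemma" matches the state perfectly but is not accessible.
premise_vecs = {
    "add_comm": [1.0, 0.0],
    "mul_comm": [0.0, 1.0],
    "add_assoc": [0.9, 0.1],
    "later_lemma": [1.0, 0.0],
}
accessible = {"add_comm", "mul_comm", "add_assoc"}
top = retrieve_premises([1.0, 0.0], premise_vecs, accessible, k=2)
print(top)
```

In this toy setup the inaccessible premise is dropped even though its vector is identical to the query, which is the point of restricting the candidate set before scoring.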

Comments:	Accepted to NeurIPS 2023 (Datasets and Benchmarks Track) as an oral presentation. Data, code, and models available at this https URL
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO); Machine Learning (stat.ML)
Cite as:	arXiv:2306.15626 [cs.LG]
	(or arXiv:2306.15626v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.15626

Submission history

From: Kaiyu Yang [view email]
[v1] Tue, 27 Jun 2023 17:05:32 UTC (2,908 KB)
[v2] Fri, 27 Oct 2023 16:00:20 UTC (3,429 KB)
