LM-Polygraph: Uncertainty Estimation for Language Models

Fadeeva, Ekaterina; Vashurin, Roman; Tsvigun, Akim; Vazhentsev, Artem; Petrakov, Sergey; Fedyanin, Kirill; Vasilev, Daniil; Goncharova, Elizaveta; Panchenko, Alexander; Panov, Maxim; Baldwin, Timothy; Shelmanov, Artem

arXiv:2311.07383 (cs)

[Submitted on 13 Nov 2023]

Abstract: Recent advancements in the capabilities of large language models (LLMs) have paved the way for a myriad of groundbreaking applications in various fields. However, a significant challenge arises as these models often "hallucinate", i.e., fabricate facts without providing users an apparent means to discern the veracity of their statements. Uncertainty estimation (UE) methods are one path to safer, more responsible, and more effective use of LLMs. However, to date, research on UE methods for LLMs has been focused primarily on theoretical rather than engineering contributions. In this work, we tackle this issue by introducing LM-Polygraph, a framework with implementations of a battery of state-of-the-art UE methods for LLMs in text generation tasks, with unified program interfaces in Python. Additionally, it introduces an extendable benchmark for consistent evaluation of UE techniques by researchers, and a demo web application that enriches the standard chat dialog with confidence scores, empowering end-users to discern unreliable responses. LM-Polygraph is compatible with the most recent LLMs, including BLOOMz, LLaMA-2, ChatGPT, and GPT-4, and is designed to support future releases of similarly-styled LMs.
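
To make the idea of an uncertainty score for generated text concrete, here is a minimal sketch of two simple token-level scores that are common in the UE literature: the negative log-probability of the generated sequence and the mean entropy of the per-token distributions. This is an illustration only, not LM-Polygraph's interface. The function names and the toy probability values are assumptions made for this example, and a real system would read the probabilities from the model's output.

```python
import math
from typing import Sequence


def sequence_negative_log_prob(token_probs: Sequence[float]) -> float:
    """Negative log-probability of a generated sequence.

    token_probs holds the probability the model assigned to each token
    it actually generated. Higher values mean higher uncertainty.
    """
    return -sum(math.log(p) for p in token_probs)


def mean_token_entropy(distributions: Sequence[Sequence[float]]) -> float:
    """Average Shannon entropy (in nats) of the per-token distributions.

    distributions holds one probability vector over the vocabulary per
    generated token. Higher values mean higher uncertainty.
    """
    entropies = [
        -sum(p * math.log(p) for p in dist if p > 0.0)
        for dist in distributions
    ]
    return sum(entropies) / len(entropies)


# Toy distributions over a hypothetical three-word vocabulary (assumed values).
confident = [[0.97, 0.02, 0.01], [0.90, 0.05, 0.05]]
unsure = [[0.40, 0.35, 0.25], [0.34, 0.33, 0.33]]

if __name__ == "__main__":
    print("mean entropy (confident):", mean_token_entropy(confident))
    print("mean entropy (unsure):", mean_token_entropy(unsure))
    # Assume greedy decoding: the generated token is the most likely one.
    print("NLL (confident):", sequence_negative_log_prob([max(d) for d in confident]))
    print("NLL (unsure):", sequence_negative_log_prob([max(d) for d in unsure]))
```

Under both scores the second response ranks as less reliable than the first, which is the kind of signal a chat interface could surface next to an answer as a confidence indicator.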

Comments:	Accepted at EMNLP-2023
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2311.07383 [cs.CL]
	(or arXiv:2311.07383v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2311.07383

Submission history

From: Artem Shelmanov
[v1] Mon, 13 Nov 2023 15:08:59 UTC (3,171 KB)
