The Factuality of Large Language Models in the Legal Domain

Hamdani, Rajaa El; Bonald, Thomas; Malliaros, Fragkiskos; Holzenberger, Nils; Suchanek, Fabian


arXiv:2409.11798 (cs)

[Submitted on 18 Sep 2024]


Abstract: This paper investigates the factuality of large language models (LLMs) as knowledge bases in the legal domain, in a realistic usage scenario: we allow for acceptable variations in the answer, and let the model abstain from answering when uncertain. First, we design a dataset of diverse factual questions about case law and legislation. We then use the dataset to evaluate several LLMs under different evaluation methods, including exact, alias, and fuzzy matching. Our results show that the performance improves significantly under the alias and fuzzy matching methods. Further, we explore the impact of abstaining and in-context examples, finding that both strategies enhance precision. Finally, we demonstrate that additional pre-training on legal documents, as seen with SaulLM, further improves factual precision from 63% to 81%.
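
To make the three matching regimes and the role of abstention concrete, here is a minimal Python sketch. It is not the authors' implementation: the normalization, the abstention marker `ABSTAIN`, the similarity threshold of 0.8, the use of `difflib.SequenceMatcher` for fuzzy matching, and the definition of precision as correct answers divided by non-abstained answers are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical abstention marker; the paper's actual prompt wording is not given here.
ABSTAIN = "i don't know"


def normalize(text):
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(text.lower().split())


def exact_match(pred, gold):
    """Correct only if the normalized strings are identical."""
    return normalize(pred) == normalize(gold)


def alias_match(pred, gold, aliases=()):
    """Correct if the prediction equals the gold answer or any listed alias."""
    candidates = {normalize(gold)} | {normalize(a) for a in aliases}
    return normalize(pred) in candidates


def fuzzy_match(pred, gold, aliases=(), threshold=0.8):
    """Correct if the prediction is similar enough to the gold answer or an alias.

    The threshold value is an assumption, not a figure from the paper.
    """
    p = normalize(pred)
    return any(
        SequenceMatcher(None, p, normalize(c)).ratio() >= threshold
        for c in (gold, *aliases)
    )


def precision(preds, golds, matcher):
    """Share of correct answers among the questions the model did not abstain on."""
    answered = [(p, g) for p, g in zip(preds, golds) if normalize(p) != ABSTAIN]
    if not answered:
        return 0.0
    return sum(matcher(p, g) for p, g in answered) / len(answered)
```

Under this sketch, an answer such as "Roe v Wade" fails exact matching against "Roe v. Wade" but passes fuzzy matching, and an abstention is excluded from the denominator instead of being counted as an error, which is one way both relaxations can raise measured precision.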

Comments:	CIKM 2024, short paper
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Cite as:	arXiv:2409.11798 [cs.CL]
	(or arXiv:2409.11798v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2409.11798

Submission history

From: Rajaa El Hamdani
[v1] Wed, 18 Sep 2024 08:30:20 UTC (7,181 KB)
