RACOON: An LLM-based Framework for Retrieval-Augmented Column Type Annotation with a Knowledge Graph

Wei, Lindsey Linxi; Xiao, Guorui; Balazinska, Magdalena


arXiv:2409.14556 (cs)

[Submitted on 22 Sep 2024 (v1), last revised 1 Nov 2024 (this version, v2)]



Abstract: As an important component of data exploration and integration, Column Type Annotation (CTA) aims to label columns of a table with one or more semantic types. With the recent development of Large Language Models (LLMs), researchers have started to explore using LLMs for CTA, leveraging their strong zero-shot capabilities. In this paper, we build on this promising work and improve LLM-based methods for CTA by showing how to use a Knowledge Graph (KG) to augment the context information provided to the LLM. Our approach, called RACOON, combines pre-trained parametric knowledge with non-parametric knowledge during generation to improve LLMs' performance on CTA. Our experiments show that RACOON achieves up to a 0.21 micro F-1 improvement over vanilla LLM inference.
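For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a micro-averaged F1 score is commonly computed when each column may carry one or more types. It is not taken from the paper: the function name `micro_f1`, the example labels, and the set-based representation are assumptions made for illustration, and the paper's exact evaluation protocol may differ.

```python
from typing import Iterable, Set


def micro_f1(gold: Iterable[Set[str]], pred: Iterable[Set[str]]) -> float:
    """Micro-averaged F1 over multi-label column annotations.

    True positives, false positives, and false negatives are pooled
    across all columns before precision and recall are combined.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # types predicted and correct
        fp += len(p - g)   # types predicted but not in the gold set
        fn += len(g - p)   # gold types the prediction missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels for three columns (illustrative only).
gold = [{"city"}, {"person", "athlete"}, {"country"}]
pred = [{"city"}, {"person"}, {"city"}]
score = micro_f1(gold, pred)  # tp=2, fp=1, fn=2, so 4/7
print(round(score, 4))
```

Because counts are pooled before averaging, frequent types weigh more heavily in the score than rare ones, which is the usual reason to report micro rather than macro F1 on skewed label distributions.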

Subjects:	Databases (cs.DB); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2409.14556 [cs.DB]
	(or arXiv:2409.14556v2 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.2409.14556

Submission history

From: Guorui Xiao
[v1] Sun, 22 Sep 2024 18:39:27 UTC (1,370 KB)
[v2] Fri, 1 Nov 2024 01:15:51 UTC (1,419 KB)
