TARGET: Benchmarking Table Retrieval for Generative Tasks

Ji, Xingyu; Glenn, Parker; Parameswaran, Aditya G.; Hulsebos, Madelon

arXiv:2505.11545 (cs)

[Submitted on 14 May 2025]

Abstract: The data landscape is rich with structured data, often of high value to organizations, driving important applications in data analysis and machine learning. Recent progress in representation learning and generative models for such data has led to the development of natural language interfaces to structured data, including those leveraging text-to-SQL. Grounding these interactions, whether through conversational interfaces or agentic components, in structured data through retrieval-augmented generation can provide substantial benefits in the freshness, accuracy, and comprehensiveness of answers. The key question is: how do we retrieve the right table(s) for the analytical query or task at hand? To this end, we introduce TARGET: a benchmark for evaluating TAble Retrieval for GEnerative Tasks. With TARGET we analyze the retrieval performance of different retrievers in isolation, as well as their impact on downstream tasks. We find that dense embedding-based retrievers far outperform a BM25 baseline, which is less effective for table retrieval than it is for retrieval over unstructured text. We also surface the sensitivity of retrievers to metadata (e.g., missing table titles), and demonstrate a stark variation of retrieval performance across datasets and tasks. TARGET is available at this https URL.
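To make the retrieval setup in the abstract concrete, the following is a minimal sketch of a sparse table retriever evaluated with recall@k. It is not TARGET's code or API. The function names (`serialize_table`, `bm25_rank`, `recall_at_k`), the serialization scheme, the BM25 parameters, and the three toy tables are all assumptions made for illustration. It scores serialized tables with a plain Okapi BM25 and compares rankings with and without table titles.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def serialize_table(table, include_title=True):
    """Flatten a table (title, header, rows) into one string (assumed scheme)."""
    parts = [table["title"]] if include_title else []
    parts.append(" ".join(table["header"]))
    parts.extend(" ".join(str(cell) for cell in row) for row in table["rows"])
    return " ".join(parts)


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices sorted by descending Okapi BM25 score."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    doc_freq = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)


def recall_at_k(ranked, gold, k):
    """1.0 if the gold table index is among the top-k results, else 0.0."""
    return 1.0 if gold in ranked[:k] else 0.0


# Toy corpus, invented for illustration only.
tables = [
    {"title": "Olympic medal counts",
     "header": ["country", "gold", "silver"],
     "rows": [["Norway", 16, 8], ["Germany", 12, 10]]},
    {"title": "City populations",
     "header": ["city", "country", "population"],
     "rows": [["Oslo", "Norway", 709000], ["Bergen", "Norway", 285000]]},
    {"title": "Employee salaries",
     "header": ["name", "department", "salary"],
     "rows": [["Ana", "Sales", 52000], ["Raj", "Engineering", 61000]]},
]
query = "How many Olympic medals did Norway win?"
gold_index = 0  # the medal table answers the query

ranked_with_title = bm25_rank(query, [serialize_table(t, True) for t in tables])
ranked_without_title = bm25_rank(query, [serialize_table(t, False) for t in tables])

print("recall@1 with titles:   ", recall_at_k(ranked_with_title, gold_index, 1))
print("recall@1 without titles:", recall_at_k(ranked_without_title, gold_index, 1))
```

On this toy corpus the gold table ranks first when titles are included and is displaced by the population table when they are removed, because the only remaining lexical match is a cell value that appears more often in the wrong table. A dense retriever would replace `bm25_rank` with nearest-neighbor search over embeddings of the same serialized strings.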

Subjects:	Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Databases (cs.DB)
Cite as:	arXiv:2505.11545 [cs.IR]
	(or arXiv:2505.11545v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2505.11545

Submission history

From: Madelon Hulsebos
[v1] Wed, 14 May 2025 19:39:46 UTC (209 KB)
