A Comprehensive Evaluation of Tool-Assisted Generation Strategies

Jacovi, Alon; Caciularu, Avi; Herzig, Jonathan; Aharoni, Roee; Bohnet, Bernd; Geva, Mor

arXiv:2310.10062 (cs)

[Submitted on 16 Oct 2023 (v1), last revised 28 Dec 2023 (this version, v2)]

Abstract: A growing area of research investigates augmenting language models with tools (e.g., search engines, calculators) to overcome their shortcomings (e.g., missing or incorrect knowledge, incorrect logical inferences). Various few-shot tool-usage strategies have been proposed. However, there is no systematic and fair comparison across different strategies, or between these strategies and strong baselines that do not leverage tools. We conduct an extensive empirical analysis, finding that (1) across various datasets, example difficulty levels, and models, strong no-tool baselines are competitive with tool-assisted strategies, implying that effectively using tools with in-context demonstrations is a difficult unsolved problem; (2) for knowledge-retrieval tasks, strategies that *refine* incorrect outputs with tools outperform strategies that retrieve relevant information *ahead of* or *during generation*; (3) tool-assisted strategies are expensive in the number of tokens they require to work -- incurring additional costs by orders of magnitude -- which does not translate into significant improvement in performance. Overall, our findings suggest that few-shot tool integration is still an open challenge, emphasizing the need for comprehensive evaluations of future strategies to accurately assess their *benefits* and *costs*.

Comments:	Accepted to EMNLP 2023 Findings
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2310.10062 [cs.CL]
	(or arXiv:2310.10062v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.10062

Submission history

From: Alon Jacovi
[v1] Mon, 16 Oct 2023 04:53:22 UTC (1,194 KB)
[v2] Thu, 28 Dec 2023 15:41:35 UTC (1,193 KB)
