Fly-Swat or Cannon? Cost-Effective Language Model Choice via Meta-Modeling

Šakota, Marija; Peyrard, Maxime; West, Robert

doi:10.1145/3616855.3635825

arXiv:2308.06077 (cs)

[Submitted on 11 Aug 2023 (v1), last revised 18 Dec 2023 (this version, v3)]

Abstract: Generative language models (LMs) have become omnipresent across data science. For a wide variety of tasks, inputs can be phrased as natural language prompts for an LM, from whose output the solution can then be extracted. LM performance has consistently been increasing with model size, but so has the monetary cost of querying the ever larger models. Importantly, however, not all inputs are equally hard: some require larger LMs for obtaining a satisfactory solution, whereas for others smaller LMs suffice. Based on this fact, we design a framework for cost-effective language model choice, called "Fly-swat or cannon" (FORC). Given a set of inputs and a set of candidate LMs, FORC judiciously assigns each input to an LM predicted to do well on the input according to a so-called meta-model, aiming to achieve high overall performance at low cost. The cost-performance tradeoff can be flexibly tuned by the user. Options include, among others, maximizing total expected performance (or the number of processed inputs) while staying within a given cost budget, or minimizing total cost while processing all inputs. We evaluate FORC on 14 datasets covering five natural language tasks, using four candidate LMs of vastly different size and cost. With FORC, we match the performance of the largest available LM while achieving a cost reduction of 63%. Via our publicly available library, researchers as well as practitioners can thus save large amounts of money without sacrificing performance.
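
The assignment idea in the abstract can be illustrated with a minimal Python sketch. This is not the FORC library or the paper's optimization procedure; it is a simplified threshold heuristic written for illustration. It assumes a meta-model has already produced a predicted score for every (input, model) pair, and all names here (`assign_cheapest_adequate`, `total_cost`, the model labels, the costs, and the scores) are hypothetical.

```python
from typing import Dict, List


def assign_cheapest_adequate(
    scores: List[Dict[str, float]],
    costs: Dict[str, float],
    threshold: float,
) -> List[str]:
    """Pick, for each input, the cheapest model predicted to be good enough.

    scores[i][m] is the (hypothetical) meta-model's predicted score of
    model m on input i; costs[m] is the per-query cost of model m.
    If no model reaches the threshold, fall back to the model with the
    highest predicted score (ties go to the cheaper model).
    """
    by_cost = sorted(costs, key=costs.get)  # model names, cheapest first
    assignment = []
    for predicted in scores:
        adequate = [m for m in by_cost if predicted[m] >= threshold]
        if adequate:
            assignment.append(adequate[0])
        else:
            assignment.append(max(by_cost, key=lambda m: predicted[m]))
    return assignment


def total_cost(assignment: List[str], costs: Dict[str, float]) -> float:
    """Sum the per-query cost of the chosen model for every input."""
    return sum(costs[m] for m in assignment)


if __name__ == "__main__":
    # Made-up costs and predicted scores, for illustration only.
    costs = {"small": 1.0, "medium": 4.0, "large": 20.0}
    scores = [
        {"small": 0.90, "medium": 0.95, "large": 0.99},  # easy input
        {"small": 0.30, "medium": 0.80, "large": 0.95},  # medium input
        {"small": 0.10, "medium": 0.40, "large": 0.90},  # hard input
    ]
    chosen = assign_cheapest_adequate(scores, costs, threshold=0.75)
    print(chosen, total_cost(chosen, costs))
```

Raising or lowering `threshold` in this sketch moves along a cost-performance tradeoff in the spirit of the one the abstract describes; the budget-constrained options mentioned there would need an actual optimization step instead of a fixed threshold.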

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2308.06077 [cs.CL]
	(or arXiv:2308.06077v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2308.06077
Related DOI:	https://doi.org/10.1145/3616855.3635825

Submission history

From: Marija Sakota
[v1] Fri, 11 Aug 2023 11:29:51 UTC (640 KB)
[v2] Tue, 12 Dec 2023 16:39:26 UTC (663 KB)
[v3] Mon, 18 Dec 2023 08:26:48 UTC (663 KB)
