Gatekeeper: Improving Model Cascades Through Confidence Tuning

Rabanser, Stephan; Rauschmayr, Nathalie; Kulshrestha, Achin; Poklukar, Petra; Jitkrittum, Wittawat; Augenstein, Sean; Wang, Congchao; Tombari, Federico

arXiv:2502.19335 (cs)

[Submitted on 26 Feb 2025 (v1), last revised 16 Jun 2025 (this version, v2)]

Abstract: Large-scale machine learning models deliver strong performance across a wide range of tasks but come with significant computational and resource constraints. To mitigate these challenges, smaller local models are often deployed alongside larger models, relying on routing and deferral mechanisms to offload complex tasks. However, existing approaches inadequately balance the capabilities of these models, often resulting in unnecessary deferrals or sub-optimal resource usage. In this work, we introduce a novel loss function called Gatekeeper for calibrating smaller models in cascade setups. Our approach fine-tunes the smaller model to confidently handle tasks it can perform correctly while deferring complex tasks to the larger model. Moreover, it incorporates a mechanism for managing the trade-off between model performance and deferral accuracy, and is broadly applicable across various tasks and domains without any architectural changes. We evaluate our method on encoder-only, decoder-only, and encoder-decoder architectures. Experiments across image classification, language modeling, and vision-language tasks show that our approach substantially improves deferral performance.
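
For readers unfamiliar with the cascade setting, the following is a minimal sketch of a confidence-based deferral rule of the kind the abstract refers to. It is not the Gatekeeper loss, which the abstract does not specify; it only illustrates how a small model's confidence can gate calls to a larger one. The function names, the use of the maximum softmax probability as the confidence score, and the threshold value of 0.8 are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cascade_predict(
    small_logits: Sequence[float],
    large_model: Callable[[], int],
    threshold: float = 0.8,  # assumed value, chosen only for illustration
) -> Tuple[int, bool]:
    """Return (predicted class, deferred?) for one input.

    The small model answers when its top softmax probability reaches
    `threshold`; otherwise the (hypothetical) large model is called.
    """
    probs = softmax(small_logits)
    confidence = max(probs)
    if confidence >= threshold:
        return probs.index(confidence), False
    return large_model(), True


# Hypothetical usage: the lambda stands in for an expensive large-model call.
confident = cascade_predict([4.0, 0.5, 0.1], large_model=lambda: 2)
uncertain = cascade_predict([1.0, 0.9, 0.8], large_model=lambda: 2)
```

Under a rule like this, deferral quality depends on how well the small model's confidence separates the inputs it gets right from those it gets wrong, which is the property the abstract says the proposed fine-tuning targets.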

Comments:	Presented at the TTODLer-FM workshop at the International Conference on Machine Learning (ICML) 2025
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2502.19335 [cs.LG]
	(or arXiv:2502.19335v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2502.19335

Submission history

From: Stephan Rabanser
[v1] Wed, 26 Feb 2025 17:29:08 UTC (471 KB)
[v2] Mon, 16 Jun 2025 15:32:06 UTC (509 KB)
