Quantitative LLM Judges

Sahoo, Aishwarya; Karnuthala, Jeevana Kruthi; Budhwani, Tushar Parmanand; Agarwal, Pranchal; Vaidyanathan, Sankaran; Siu, Alexa; Dernoncourt, Franck; Healey, Jennifer; Lipka, Nedim; Rossi, Ryan; Bhattacharya, Uttaran; Kveton, Branislav

arXiv:2506.02945 (cs)

[Submitted on 3 Jun 2025]

Abstract: LLM-as-a-judge is a framework in which a large language model (LLM) automatically evaluates the output of another LLM. We propose quantitative LLM judges, which align evaluation scores of existing LLM judges to human scores in a given domain using regression models. The models are trained to improve the score of the original judge by using the judge's textual evaluation and score. We present four quantitative judges for different types of absolute and relative feedback, showcasing the generality and versatility of our framework. Our framework is more computationally efficient than supervised fine-tuning and can be more statistically efficient when human feedback is limited, which is expected in most applications of our work. We validate these claims empirically on four datasets using two base judges. Our experiments show that quantitative judges can effectively improve the predictive power of existing judges through post-hoc modeling.
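To make the post-hoc idea in the abstract concrete, the following is a minimal sketch of one possible absolute-score setup, not the authors' implementation. It assumes the judge's textual evaluation has already been turned into a fixed-length embedding by some unspecified encoder, and it fits a ridge regression from that embedding plus the judge's raw score to the human score. The function names, the ridge penalty `l2`, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def design_matrix(eval_embeddings, judge_scores):
    """Stack evaluation-text features, the judge's score, and a bias column."""
    judge_scores = np.asarray(judge_scores, dtype=float)
    return np.column_stack([eval_embeddings, judge_scores, np.ones(len(judge_scores))])

def fit_quantitative_judge(eval_embeddings, judge_scores, human_scores, l2=1.0):
    """Closed-form ridge regression from judge outputs to human scores."""
    X = design_matrix(eval_embeddings, judge_scores)
    y = np.asarray(human_scores, dtype=float)
    # Solve (X^T X + l2 * I) w = X^T y; for brevity the bias is regularized too.
    gram = X.T @ X + l2 * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ y)

def predict(weights, eval_embeddings, judge_scores):
    """Score new items with the fitted post-hoc model."""
    return design_matrix(eval_embeddings, judge_scores) @ weights

# Synthetic stand-in data (an assumption for illustration, NOT from the paper):
# the "judge" is a rescaled, shifted, noisy view of the human score.
rng = np.random.default_rng(0)
n, d = 200, 8
emb = rng.normal(size=(n, d))                      # stand-in evaluation embeddings
human = 3.0 + (emb @ rng.normal(size=d)) * 0.5 + rng.normal(scale=0.1, size=n)
judge = 0.5 * human + 2.0 + rng.normal(scale=0.3, size=n)

train, test = slice(0, 150), slice(150, None)
w = fit_quantitative_judge(emb[train], judge[train], human[train])
pred = predict(w, emb[test], judge[test])

mse_raw = float(np.mean((judge[test] - human[test]) ** 2))
mse_cal = float(np.mean((pred - human[test]) ** 2))
print(f"raw judge MSE: {mse_raw:.3f}, post-hoc model MSE: {mse_cal:.3f}")
```

Only the small regression model is trained here; the base judge is left untouched, which is the sense in which such an approach can be cheaper than supervised fine-tuning. The toy data are constructed so that the judge is systematically miscalibrated, so the comparison says nothing about real judges.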

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2506.02945 [cs.CL]
	(or arXiv:2506.02945v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2506.02945

Submission history

From: Branislav Kveton
[v1] Tue, 3 Jun 2025 14:44:23 UTC (223 KB)
