UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models

Yang, Yuzhe; Zhang, Yifei; Hu, Yan; Guo, Yilin; Gan, Ruoli; He, Yueru; Lei, Mingcong; Zhang, Xiao; Wang, Haining; Xie, Qianqian; Huang, Jimin; Yu, Honghai; Wang, Benyou

arXiv:2410.14059 (q-fin)

[Submitted on 17 Oct 2024 (v1), last revised 7 Feb 2025 (this version, v3)]

Abstract: This paper introduces the User-Centric Financial Expertise (UCFE) benchmark, a framework designed to evaluate the ability of large language models (LLMs) to handle complex real-world financial tasks. The UCFE benchmark adopts a hybrid approach that combines human expert evaluations with dynamic, task-specific interactions to simulate the complexities of evolving financial scenarios. First, we conducted a user study involving 804 participants, collecting their feedback on financial tasks. Second, based on this feedback, we created a dataset that encompasses a wide range of user intents and interactions. This dataset serves as the foundation for benchmarking 11 LLM services using the LLM-as-Judge methodology. Our results show a significant alignment between benchmark scores and human preferences, with a Pearson correlation coefficient of 0.78, confirming the effectiveness of the UCFE dataset and our evaluation approach. The UCFE benchmark not only reveals the potential of LLMs in the financial domain but also provides a robust framework for assessing their performance and user satisfaction.
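
The abstract reports agreement between benchmark scores and human preferences as a Pearson correlation coefficient. As a minimal sketch of how such a coefficient is computed in general, the following uses only the Python standard library. The function name `pearson` and both input lists are assumptions made for illustration; the numbers are hypothetical placeholders, not data or results from the paper, and this is not the authors' evaluation code.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT taken from the paper:
# one benchmark score and one human preference score per model.
benchmark_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
human_preferences = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson(benchmark_scores, human_preferences)
print(f"Pearson r = {r:.2f}")
```

A value near 1 indicates that models ranked highly by the benchmark also tend to be preferred by human evaluators; a value near 0 indicates no linear relationship.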

Subjects:	Computational Finance (q-fin.CP); Computational Engineering, Finance, and Science (cs.CE); Computation and Language (cs.CL)
Cite as:	arXiv:2410.14059 [q-fin.CP]
	(or arXiv:2410.14059v3 [q-fin.CP] for this version)
	https://doi.org/10.48550/arXiv.2410.14059

Submission history

From: Yuzhe Yang
[v1] Thu, 17 Oct 2024 22:03:52 UTC (4,789 KB)
[v2] Tue, 22 Oct 2024 06:47:43 UTC (4,764 KB)
[v3] Fri, 7 Feb 2025 08:37:03 UTC (4,789 KB)