PsyEval: A Suite of Mental Health Related Tasks for Evaluating Large Language Models

Jin, Haoan; Chen, Siyuan; Dilixiati, Dilawaier; Jiang, Yewei; Wu, Mengyue; Zhu, Kenny Q.

arXiv:2311.09189 (cs)

[Submitted on 15 Nov 2023 (v1), last revised 3 Jun 2024 (this version, v2)]

Abstract: Evaluating Large Language Models (LLMs) in the mental health domain poses challenges distinct from those in other domains, given the subtle and highly subjective nature of symptoms, which vary significantly among individuals. This paper presents PsyEval, the first comprehensive suite of mental health-related tasks for evaluating LLMs. PsyEval encompasses five sub-tasks that evaluate three critical dimensions of mental health. The framework is designed to assess the unique challenges and intricacies of mental health-related tasks, making PsyEval a specialized tool for evaluating LLM performance in this domain. We evaluate twelve advanced LLMs using PsyEval. Experimental results not only demonstrate significant room for improvement in current LLMs on mental health tasks but also reveal potential directions for future model optimization.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2311.09189 [cs.CL]
	(or arXiv:2311.09189v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2311.09189

Submission history

From: Haoan Jin
[v1] Wed, 15 Nov 2023 18:32:27 UTC (674 KB)
[v2] Mon, 3 Jun 2024 08:37:10 UTC (3,927 KB)
