Do Large Language Models Align with Core Mental Health Counseling Competencies?
Authors:
Viet Cuong Nguyen,
Mohammad Taher,
Dongwan Hong,
Vinicius Konkolics Possobom,
Vibha Thirunellayi Gopalakrishnan,
Ekta Raj,
Zihang Li,
Heather J. Soled,
Michael L. Birnbaum,
Srijan Kumar,
Munmun De Choudhury
Abstract:
The rapid evolution of Large Language Models (LLMs) presents a promising solution to the global shortage of mental health professionals. However, their alignment with essential counseling competencies remains underexplored. We introduce CounselingBench, a novel NCMHCE-based benchmark evaluating 22 general-purpose and medical-finetuned LLMs across five key competencies. While frontier models surpass minimum aptitude thresholds, they fall short of expert-level performance, excelling in Intake, Assessment & Diagnosis but struggling with Core Counseling Attributes and Professional Practice & Ethics. Surprisingly, medical LLMs do not outperform generalist models in accuracy, though they provide slightly better justifications while making more context-related errors. These findings highlight the challenges of developing AI for mental health counseling, particularly in competencies requiring empathy and nuanced reasoning. Our results underscore the need for specialized, fine-tuned models aligned with core mental health counseling competencies and supported by human oversight before real-world deployment. Code and data associated with this manuscript can be found at: https://github.com/cuongnguyenx/CounselingBench
Submitted 26 February, 2025; v1 submitted 29 October, 2024; originally announced October 2024.