Automated Feedback in Math Education: A Comparative Analysis of LLMs for Open-Ended Responses

Baral, Sami; Worden, Eamon; Lim, Wen-Chiang; Luo, Zhuang; Santorelli, Christopher; Gurung, Ashish; Heffernan, Neil

arXiv:2411.08910 (cs)

[Submitted on 29 Oct 2024]

Abstract: The effectiveness of feedback in enhancing learning outcomes is well documented within Educational Data Mining (EDM), and prior research has explored various methodologies to make feedback more effective. Recent developments in Large Language Models (LLMs) have extended their utility in automated feedback systems. This study explores the potential of LLMs to facilitate automated feedback in math education. We examine the effectiveness of LLMs in evaluating student responses by comparing three models: Llama, SBERT-Canberra, and GPT4. The evaluation requires each model to provide both a quantitative score and qualitative feedback on a student's response to open-ended math problems. We employ Mistral, a version of Llama tailored to math, and fine-tune it to evaluate student responses using a dataset of student responses and teacher-written feedback for middle-school math problems. The SBERT model was trained with a similar approach, while the GPT4 model was used in a zero-shot setting. We evaluate the models' scoring accuracy and feedback quality using judgments from two teachers, who applied a shared rubric to assess the accuracy and relevance of the generated feedback. We conduct both quantitative and qualitative analyses of model performance. By offering a detailed comparison of these methods, this study aims to further the ongoing development of automated feedback systems and outlines potential future directions for leveraging generative LLMs to create more personalized learning experiences.

Comments:	12 pages including references, 4 figures, 9 tables
Subjects:	Computers and Society (cs.CY); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2411.08910 [cs.CY]
	(or arXiv:2411.08910v1 [cs.CY] for this version)
	https://doi.org/10.48550/arXiv.2411.08910

Submission history

From: Sami Baral
[v1] Tue, 29 Oct 2024 16:57:45 UTC (601 KB)
