Showing 1–1 of 1 results for author: Arderne, M

  1. arXiv:2502.04349

    cs.CL cs.AI

    Dynamic benchmarking framework for LLM-based conversational data capture

    Authors: Pietro Alessandro Aluffi, Patrick Zietkiewicz, Marya Bazzi, Matt Arderne, Vladimirs Murevics

    Abstract: The rapid evolution of large language models (LLMs) has transformed conversational agents, enabling complex human-machine interactions. However, evaluation frameworks often focus on single tasks, failing to capture the dynamic nature of multi-turn dialogues. This paper introduces a dynamic benchmarking framework to assess LLM-based conversational agents through interactions with synthetic users. T…

    Submitted 4 February, 2025; originally announced February 2025.