EventChat: Implementation and user-centric evaluation of a large language model-driven conversational recommender system for exploring leisure events in an SME context

Kunstmann, Hannes; Ollier, Joseph; Persson, Joel; von Wangenheim, Florian

Computer Science > Information Retrieval

arXiv:2407.04472 (cs)

[Submitted on 5 Jul 2024 (v1), last revised 9 Jul 2024 (this version, v3)]

Title:EventChat: Implementation and user-centric evaluation of a large language model-driven conversational recommender system for exploring leisure events in an SME context

Authors:Hannes Kunstmann, Joseph Ollier, Joel Persson, Florian von Wangenheim

View PDF

Abstract:Large language models (LLMs) present an enormous evolution in the strategic potential of conversational recommender systems (CRS). Yet to date, research has predominantly focused upon technical frameworks to implement LLM-driven CRS, rather than end-user evaluations or strategic implications for firms, particularly from the perspective of a small to medium enterprises (SME) that makeup the bedrock of the global economy. In the current paper, we detail the design of an LLM-driven CRS in an SME setting, and its subsequent performance in the field using both objective system metrics and subjective user evaluations. While doing so, we additionally outline a short-form revised ResQue model for evaluating LLM-driven CRS, enabling replicability in a rapidly evolving field. Our results reveal good system performance from a user experience perspective (85.5% recommendation accuracy) but underscore latency, cost, and quality issues challenging business viability. Notably, with a median cost of $0.04 per interaction and a latency of 5.7s, cost-effectiveness and response time emerge as crucial areas for achieving a more user-friendly and economically viable LLM-driven CRS for SME settings. One major driver of these costs is the use of an advanced LLM as a ranker within the retrieval-augmented generation (RAG) technique. Our results additionally indicate that relying solely on approaches such as Prompt-based learning with ChatGPT as the underlying LLM makes it challenging to achieve satisfying quality in a production environment. Strategic considerations for SMEs deploying an LLM-driven CRS are outlined, particularly considering trade-offs in the current technical landscape.

Comments:	27 pages, 3 tables, 5 figures, pre-print manuscript, updated version of manuscript due to typo (previous version, Figure 5 was incorrectly named Figure 6)
Subjects:	Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
MSC classes:	68T50
ACM classes:	I.2.7; H.5.2
Cite as:	arXiv:2407.04472 [cs.IR]
	(or arXiv:2407.04472v3 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2407.04472

Submission history

From: Joseph Ollier Dr [view email]
[v1] Fri, 5 Jul 2024 12:42:31 UTC (691 KB)
[v2] Mon, 8 Jul 2024 14:50:49 UTC (693 KB)
[v3] Tue, 9 Jul 2024 13:31:00 UTC (699 KB)

Computer Science > Information Retrieval

Title:EventChat: Implementation and user-centric evaluation of a large language model-driven conversational recommender system for exploring leisure events in an SME context

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Information Retrieval

Title:EventChat: Implementation and user-centric evaluation of a large language model-driven conversational recommender system for exploring leisure events in an SME context

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators