UserSimCRS: A User Simulation Toolkit for Evaluating Conversational Recommender Systems
Authors:
Jafar Afzali,
Aleksander Mark Drzewiecki,
Krisztian Balog,
Shuo Zhang
Abstract:
We present an extensible user simulation toolkit to facilitate automatic evaluation of conversational recommender systems. It builds on an established agenda-based approach and extends it with several novel elements, including user satisfaction prediction, persona and context modeling, and conditional natural language generation. We showcase the toolkit with a pre-existing movie recommender system and demonstrate its ability to simulate dialogues that mimic real conversations, while requiring only a handful of manually annotated dialogues as training data.
Submitted 24 January, 2023; v1 submitted 13 January, 2023;
originally announced January 2023.
POINTREC: A Test Collection for Narrative-driven Point of Interest Recommendation
Authors:
Jafar Afzali,
Aleksander Mark Drzewiecki,
Krisztian Balog
Abstract:
This paper presents a test collection for contextual point of interest (POI) recommendation in a narrative-driven scenario. There, user history is not available; instead, user requests are described in natural language. The requests in our collection are manually collected from social sharing websites and are annotated with various types of metadata, including location, categories, constraints, and example POIs. These requests are to be resolved from a dataset of POIs, which are collected from a popular online directory and are further linked to a geographical knowledge base and enriched with relevant web snippets. Graded relevance assessments are collected using crowdsourcing, by pooling both manual and automatic recommendations, where the latter serve as baselines for future performance comparison. This resource supports the development of novel approaches for end-to-end POI recommendation as well as for specific semantic annotation tasks on natural language requests.
Submitted 19 May, 2021;
originally announced May 2021.