Local Policy Improvement for Recommender Systems

Liang, Dawen; Vlassis, Nikos

arXiv:2212.11431 (cs)

[Submitted on 22 Dec 2022 (v1), last revised 26 Apr 2023 (this version, v2)]

Abstract: Recommender systems predict what items a user will interact with next, based on their past interactions. The problem is often approached through supervised learning, but recent advancements have shifted towards policy optimization of rewards (e.g., user engagement). One challenge with the latter is policy mismatch: we are only able to train a new policy given data collected from a previously-deployed policy. The conventional way to address this problem is through importance sampling correction, but this comes with practical limitations. We suggest an alternative approach of local policy improvement without off-policy correction. Our method computes and optimizes a lower bound of expected reward of the target policy, which is easy to estimate from data and does not involve density ratios (such as those appearing in importance sampling correction). This local policy improvement paradigm is ideal for recommender systems, as previous policies are typically of decent quality and policies are updated frequently. We provide empirical evidence and practical recipes for applying our technique in a sequential recommendation setting.
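
For readers unfamiliar with the importance sampling correction that the abstract contrasts against, the following is a minimal sketch of a standard inverse propensity scoring (IPS) estimate on a toy three-item problem. It is not the method proposed in the paper; it only illustrates where density ratios enter. The function name `ips_estimate`, the two toy policies, and the per-item rewards are all assumptions made for illustration.

```python
import numpy as np


def ips_estimate(rewards, target_probs, logging_probs):
    """Standard IPS estimate of the target policy's expected reward.

    Each logged reward is reweighted by the density ratio
    target_probs / logging_probs of the action that was actually logged.
    """
    rewards = np.asarray(rewards, dtype=float)
    ratios = np.asarray(target_probs, dtype=float) / np.asarray(logging_probs, dtype=float)
    return float(np.mean(ratios * rewards))


# Hypothetical toy setup: three items, context-free policies.
logging_policy = np.array([0.5, 0.3, 0.2])  # previously deployed policy (assumed)
target_policy = np.array([0.2, 0.3, 0.5])   # new policy to evaluate (assumed)
item_rewards = np.array([0.1, 0.5, 1.0])    # deterministic reward per item (assumed)

# Log interactions by sampling actions from the logging policy only.
rng = np.random.default_rng(0)
n = 100_000
actions = rng.choice(3, size=n, p=logging_policy)
rewards = item_rewards[actions]

estimate = ips_estimate(rewards, target_policy[actions], logging_policy[actions])
true_value = float(target_policy @ item_rewards)
largest_ratio = float(np.max(target_policy / logging_policy))

print(f"IPS estimate: {estimate:.3f}, true value: {true_value:.3f}")
print(f"largest density ratio: {largest_ratio:.1f}")
```

The estimate is unbiased in this toy case, but its variance grows with the size of the density ratios, which can become very large when the logging policy rarely selects an item that the target policy favors. This is one commonly cited practical limitation of importance sampling correction.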

Subjects:	Machine Learning (cs.LG); Information Retrieval (cs.IR)
Cite as:	arXiv:2212.11431 [cs.LG]
	(or arXiv:2212.11431v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2212.11431

Submission history

From: Dawen Liang
[v1] Thu, 22 Dec 2022 00:47:40 UTC (123 KB)
[v2] Wed, 26 Apr 2023 22:49:15 UTC (169 KB)
