Model-Based Reinforcement Learning for Offline Zero-Sum Markov Games

Yan, Yuling; Li, Gen; Chen, Yuxin; Fan, Jianqing

arXiv:2206.04044 (cs)

[Submitted on 8 Jun 2022 (v1), last revised 26 Feb 2024 (this version, v2)]

Abstract: This paper makes progress towards learning Nash equilibria in two-player zero-sum Markov games from offline data. Specifically, consider a $\gamma$-discounted infinite-horizon Markov game with $S$ states, where the max-player has $A$ actions and the min-player has $B$ actions. We propose a pessimistic model-based algorithm with Bernstein-style lower confidence bounds, called VI-LCB-Game, that provably finds an $\varepsilon$-approximate Nash equilibrium with a sample complexity no larger than $\frac{C_{\mathsf{clipped}}^{\star}S(A+B)}{(1-\gamma)^{3}\varepsilon^{2}}$ (up to a logarithmic factor). Here, $C_{\mathsf{clipped}}^{\star}$ is a unilateral clipped concentrability coefficient that reflects the coverage and distribution shift of the available data (vis-à-vis the target data), and the target accuracy $\varepsilon$ can be any value within $\big(0,\frac{1}{1-\gamma}\big]$. Our sample complexity bound strengthens prior art by a factor of $\min\{A,B\}$, achieving minimax optimality for the entire $\varepsilon$-range. An appealing feature of our result is its algorithmic simplicity: it shows that variance reduction and sample splitting are unnecessary for achieving sample optimality.
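To make the scaling in the abstract concrete, the following is a minimal sketch that evaluates the order of the stated bound, $\frac{C_{\mathsf{clipped}}^{\star}S(A+B)}{(1-\gamma)^{3}\varepsilon^{2}}$, for given problem sizes. The function name `sample_complexity_order` and its parameter names are hypothetical and not taken from the paper; the constant and the logarithmic factor are dropped, so the result indicates only how the bound scales, not an actual sample count. The value passed for `c_clipped` is an assumed input, since the abstract does not say how to compute it.

```python
def sample_complexity_order(c_clipped, S, A, B, gamma, eps):
    """Order of the abstract's sample complexity bound.

    Evaluates C * S * (A + B) / ((1 - gamma)^3 * eps^2), ignoring the
    universal constant and the log factor (an illustrative simplification).
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie in (0, 1)")
    # The abstract allows any target accuracy eps in (0, 1 / (1 - gamma)].
    max_eps = 1.0 / (1.0 - gamma)
    if not 0.0 < eps <= max_eps:
        raise ValueError("eps must lie in (0, 1 / (1 - gamma)]")
    return c_clipped * S * (A + B) / ((1.0 - gamma) ** 3 * eps ** 2)


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only for illustration.
    base = sample_complexity_order(1.0, S=10, A=3, B=2, gamma=0.5, eps=1.0)
    finer = sample_complexity_order(1.0, S=10, A=3, B=2, gamma=0.5, eps=0.5)
    print(base, finer, finer / base)
```

Under these assumptions, halving $\varepsilon$ multiplies the bound by four, and the dependence on the action spaces is through the sum $A+B$ rather than a product.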

Comments:	accepted to Operations Research
Subjects:	Machine Learning (cs.LG); Computer Science and Game Theory (cs.GT); Information Theory (cs.IT); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:2206.04044 [cs.LG]
	(or arXiv:2206.04044v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2206.04044
Journal reference:	Operations Research, vol. 72, no. 6, pp. 2430-2445, 2024

Submission history

From: Yuling Yan
[v1] Wed, 8 Jun 2022 17:58:06 UTC (108 KB)
[v2] Mon, 26 Feb 2024 15:18:31 UTC (56 KB)
