Model-based Offline Reinforcement Learning with Count-based Conservatism

Kim, Byeongchan; Oh, Min-hwan

arXiv:2307.11352 (cs)

[Submitted on 21 Jul 2023]

Abstract: In this paper, we propose a model-based offline reinforcement learning method that integrates count-based conservatism, named $\texttt{Count-MORL}$. Our method uses count estimates of state-action pairs to quantify model estimation error; to the best of our knowledge, it is the first algorithm to demonstrate the efficacy of count-based conservatism in model-based offline deep RL. We first show that the estimation error is inversely proportional to the frequency of state-action pairs. We then demonstrate that the policy learned under the count-based conservative model offers near-optimal performance guarantees. Through extensive numerical experiments, we validate that $\texttt{Count-MORL}$ with a hash code implementation significantly outperforms existing offline RL algorithms on the D4RL benchmark datasets. The code is accessible at this https URL.
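To make the idea of count-based conservatism with hash codes concrete, the following is a minimal sketch and not the authors' implementation. It assumes a SimHash-style code (the sign pattern of random Gaussian projections) to bucket continuous state-action pairs, and a reward penalty of the form `beta / sqrt(count + 1)`. The class `SimHashCounter`, the function `penalized_reward`, the parameters `n_bits` and `beta`, and the exact penalty form are all hypothetical choices made for illustration; the abstract only states that the estimation error shrinks as the state-action frequency grows.

```python
import math
import random
from collections import Counter


class SimHashCounter:
    """Counts state-action pairs by the sign pattern of random projections.

    Hypothetical helper: the hashing scheme is an assumption, not taken
    from the abstract.
    """

    def __init__(self, dim, n_bits=16, seed=0):
        rng = random.Random(seed)
        # One Gaussian projection vector per hash bit.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(n_bits)]
        self.counts = Counter()

    def _code(self, state, action):
        # Concatenate state and action, then keep the sign of each projection.
        x = list(state) + list(action)
        return tuple(sum(w * v for w, v in zip(plane, x)) >= 0.0
                     for plane in self.planes)

    def update(self, state, action):
        """Record one occurrence of (state, action) from the offline dataset."""
        self.counts[self._code(state, action)] += 1

    def count(self, state, action):
        """Return how often the hash bucket of (state, action) was seen."""
        return self.counts[self._code(state, action)]


def penalized_reward(reward, count, beta=1.0):
    """Subtract a penalty that shrinks as the count grows.

    The 1/sqrt(count + 1) form is an assumed example of a count-based
    penalty; beta is a hypothetical conservatism coefficient.
    """
    return reward - beta / math.sqrt(count + 1)
```

In use, one would call `update` once per transition in the offline dataset and then apply `penalized_reward` to model-generated rollouts, so that rarely seen state-action pairs receive lower rewards than frequently seen ones. Nearby inputs tend to share a hash bucket, which is what lets counts generalize across continuous spaces.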

Comments:	Accepted in ICML 2023
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2307.11352 [cs.LG]
	(or arXiv:2307.11352v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2307.11352

Submission history

From: Byeongchan Kim
[v1] Fri, 21 Jul 2023 04:59:23 UTC (2,347 KB)
