Cancellation-Free Regret Bounds for Lagrangian Approaches in Constrained Markov Decision Processes

Müller, Adrian; Alatur, Pragnya; Ramponi, Giorgia; He, Niao

Computer Science > Machine Learning

arXiv:2306.07001 (cs)

[Submitted on 12 Jun 2023 (v1), last revised 30 Aug 2023 (this version, v2)]

Abstract: Constrained Markov Decision Processes (CMDPs) are one of the common ways to model safe reinforcement learning problems, where constraint functions model the safety objectives. Lagrangian-based dual or primal-dual algorithms provide efficient methods for learning in CMDPs. For these algorithms, the currently known regret bounds in the finite-horizon setting allow for a "cancellation of errors"; one can compensate for a constraint violation in one episode with a strict constraint satisfaction in another. However, we do not consider such a behavior safe in practical applications. In this paper, we overcome this weakness by proposing a novel model-based dual algorithm OptAug-CMDP for tabular finite-horizon CMDPs. Our algorithm is motivated by the augmented Lagrangian method and can be performed efficiently. We show that during $K$ episodes of exploring the CMDP, our algorithm obtains a regret of $\tilde{O}(\sqrt{K})$ for both the objective and the constraint violation. Unlike existing Lagrangian approaches, our algorithm achieves this regret without the need for the cancellation of errors.
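
The "cancellation of errors" described in the abstract can be illustrated with a small numeric sketch. This listing does not reproduce the paper's formal regret definitions, so the code below rests on an assumption: regret with cancellation is taken as the signed sum of per-episode constraint violations, and cancellation-free regret as the sum of their positive parts. The function names and the sample values are hypothetical and are not taken from the paper or from OptAug-CMDP.

```python
from typing import Sequence


def regret_with_cancellation(violations: Sequence[float]) -> float:
    """Sum signed per-episode violations; negative terms offset positive ones."""
    return sum(violations)


def regret_without_cancellation(violations: Sequence[float]) -> float:
    """Sum only the positive part of each per-episode violation."""
    return sum(max(v, 0.0) for v in violations)


# Hypothetical per-episode constraint violations (positive = constraint violated,
# negative = constraint satisfied with slack). Illustrative values only.
violations = [0.5, -0.5, 0.25, -0.25]

weak = regret_with_cancellation(violations)
strong = regret_without_cancellation(violations)
print(weak, strong)
```

Under this assumed reading, the signed sum can be zero even though the constraint was violated in half of the episodes, whereas the sum of positive parts still records every violation. That gap is the sense in which a bound that permits cancellation may understate unsafe behavior.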

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2306.07001 [cs.LG]
	(or arXiv:2306.07001v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.07001

Submission history

From: Adrian Müller
[v1] Mon, 12 Jun 2023 10:10:57 UTC (43 KB)
[v2] Wed, 30 Aug 2023 15:58:45 UTC (47 KB)
