Reward Certification for Policy Smoothed Reinforcement Learning

Mu, Ronghui; Marcolino, Leandro Soriano; Zhang, Tianle; Zhang, Yanghao; Huang, Xiaowei; Ruan, Wenjie

arXiv:2312.06436 (cs)

[Submitted on 11 Dec 2023 (v1), last revised 12 Dec 2023 (this version, v2)]

Abstract: Reinforcement Learning (RL) has achieved remarkable success in safety-critical areas, but it can be weakened by adversarial attacks. Recent studies have introduced "smoothed policies" in order to enhance its robustness. Yet, it is still challenging to establish a provable guarantee to certify the bound of its total reward. Prior methods relied primarily on computing bounds using Lipschitz continuity or calculating the probability of cumulative reward above specific thresholds. However, these techniques are only suited for continuous perturbations on the RL agent's observations and are restricted to perturbations bounded by the $l_2$-norm. To address these limitations, this paper proposes a general black-box certification method capable of directly certifying the cumulative reward of the smoothed policy under various $l_p$-norm bounded perturbations. Furthermore, we extend our methodology to certify perturbations on action spaces. Our approach leverages f-divergence to measure the distinction between the original distribution and the perturbed distribution, subsequently determining the certification bound by solving a convex optimisation problem. We provide a comprehensive theoretical analysis and run sufficient experiments in multiple environments. Our results show that our method not only improves the certified lower bound of mean cumulative reward but also demonstrates better efficiency than state-of-the-art techniques.
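
The idea of bounding a reward through a divergence between the clean and perturbed smoothing distributions can be illustrated with a much simpler bound than the one the paper derives. The sketch below is not the authors' method: it assumes a single Gaussian smoothing draw with standard deviation `sigma`, an $l_2$ perturbation of size `delta_l2`, a reward known to lie in `[r_min, r_max]`, and an exactly known clean mean reward (in practice this would be an estimate with its own confidence interval). It uses total variation distance, which is one particular f-divergence, and all function names are hypothetical.

```python
import math


def gaussian_tv(delta_l2: float, sigma: float) -> float:
    """Total variation distance between N(x, sigma^2 I) and N(x + delta, sigma^2 I).

    For two Gaussians with the same covariance this equals
    2 * Phi(||delta||_2 / (2 * sigma)) - 1, written here with erf.
    """
    return math.erf(delta_l2 / (2.0 * math.sqrt(2.0) * sigma))


def reward_lower_bound(mean_reward: float, r_min: float, r_max: float,
                       delta_l2: float, sigma: float) -> float:
    """Lower bound on the perturbed mean reward (illustrative assumption only).

    For any function bounded in [r_min, r_max], the expectations under two
    distributions differ by at most (r_max - r_min) * TV.
    """
    tv = gaussian_tv(delta_l2, sigma)
    return max(r_min, mean_reward - (r_max - r_min) * tv)


if __name__ == "__main__":
    # Hypothetical numbers: clean mean reward 0.8 on a [0, 1] scale.
    for eps in (0.0, 0.05, 0.1, 0.2):
        print(eps, reward_lower_bound(0.8, 0.0, 1.0, eps, sigma=0.2))
```

This single-step bound ignores how perturbations compound over an episode and is typically loose; the abstract's convex optimisation over f-divergence constraints is aimed at obtaining tighter bounds on cumulative reward and at covering other $l_p$ norms.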

Comments:	This paper will be presented at AAAI 2024
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2312.06436 [cs.LG]
	(or arXiv:2312.06436v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2312.06436

Submission history

From: Ronghui Mu
[v1] Mon, 11 Dec 2023 15:07:58 UTC (837 KB)
[v2] Tue, 12 Dec 2023 12:19:31 UTC (1,000 KB)
