Overcoming Non-stationary Dynamics with Evidential Proximal Policy Optimization

Akgül, Abdullah; Baykal, Gulcin; Haußmann, Manuel; Kandemir, Melih

Abstract: Continuous control of non-stationary environments is a major challenge for deep reinforcement learning algorithms. The time-dependency of the state transition dynamics aggravates the notorious stability problems of model-free deep actor-critic architectures. We posit that two properties will play a key role in overcoming non-stationarity in transition dynamics: (i) preserving the plasticity of the critic network, and (ii) directed exploration for rapid adaptation to the changing dynamics. We show that performing on-policy reinforcement learning with an evidential critic provides both of these properties. The evidential design ensures a fast and sufficiently accurate approximation of the uncertainty around the state value, which maintains the plasticity of the critic network by detecting the distributional shifts caused by changes in the dynamics. The probabilistic critic also makes the actor training objective a random variable, enabling the use of directed exploration approaches as a by-product. We name the resulting algorithm $\textit{Evidential Proximal Policy Optimization (EPPO)}$ due to the integral role of evidential uncertainty quantification in both the policy evaluation and policy improvement stages. Through experiments on non-stationary continuous control tasks, where the environment dynamics change at regular intervals, we demonstrate that our algorithm outperforms state-of-the-art on-policy reinforcement learning variants in both task-specific and overall return.
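To make the role of an evidential critic more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the critic emits the four Normal-Inverse-Gamma parameters commonly used in evidential regression, and it uses a UCB-style bonus as one possible form of directed exploration. The names `NIG`, `epistemic_var`, `optimistic_advantage`, and the coefficient `kappa` are hypothetical and do not come from the abstract.

```python
import math
from dataclasses import dataclass


@dataclass
class NIG:
    """Normal-Inverse-Gamma parameters an evidential critic head might emit
    for a single state (assumed parameterization, not taken from the paper)."""
    gamma: float  # predicted mean of the state value
    nu: float     # > 0, virtual evidence supporting the mean
    alpha: float  # > 1, shape of the prior over the variance
    beta: float   # > 0, scale of the prior over the variance


def aleatoric_var(p: NIG) -> float:
    """Expected observation noise, E[sigma^2] = beta / (alpha - 1)."""
    return p.beta / (p.alpha - 1.0)


def epistemic_var(p: NIG) -> float:
    """Uncertainty about the mean, Var[mu] = beta / (nu * (alpha - 1))."""
    return p.beta / (p.nu * (p.alpha - 1.0))


def optimistic_advantage(advantage: float, p: NIG, kappa: float = 1.0) -> float:
    """Hypothetical UCB-style rule: add a bonus proportional to the
    epistemic standard deviation of the critic's value estimate."""
    return advantage + kappa * math.sqrt(epistemic_var(p))


# Same predicted value and noise level, but different amounts of evidence.
familiar = NIG(gamma=1.0, nu=10.0, alpha=3.0, beta=2.0)  # well-visited regime
shifted = NIG(gamma=1.0, nu=0.5, alpha=3.0, beta=2.0)    # after a dynamics change

for name, p in [("familiar", familiar), ("shifted", shifted)]:
    print(name, aleatoric_var(p), epistemic_var(p), optimistic_advantage(0.0, p))
```

In this sketch the two states share the same predicted value and aleatoric variance, but the low-evidence state has a larger epistemic variance and therefore receives a larger exploration bonus. That is the kind of signal a distributional shift in the dynamics would be expected to produce.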

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2503.01468 [cs.LG]
	(or arXiv:2503.01468v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2503.01468
