
Search v0.5.6 released 2020-02-24

arXiv:2011.11293

cs.LG cs.NE

Evolutionary Planning in Latent Space

Authors: Thor V. A. N. Olesen, Dennis T. T. Nguyen, Rasmus Berg Palm, Sebastian Risi

Abstract: Planning is a powerful approach to reinforcement learning with several desirable properties. However, it requires a model of the world, which is not readily available in many real-life problems. In this paper, we propose to learn a world model that enables Evolutionary Planning in Latent Space (EPLS). We use a Variational Auto Encoder (VAE) to learn a compressed latent representation of individual observations and extend a Mixture Density Recurrent Neural Network (MDRNN) to learn a stochastic, multi-modal forward model of the world that can be used for planning. We use Random Mutation Hill Climbing (RMHC) to find a sequence of actions that maximizes expected reward in this learned model of the world. We demonstrate how to build a model of the world by bootstrapping it with rollouts from a random policy and iteratively refining it with rollouts from an increasingly accurate planning policy that uses the learned world model. After a few iterations of this refinement, our planning agents outperform standard model-free reinforcement learning approaches, demonstrating the viability of our approach.
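The planning step named in the abstract can be illustrated with a minimal sketch of Random Mutation Hill Climbing over a fixed-length action sequence. This is not the authors' implementation: `rmhc_plan`, `toy_reward`, the mutation rate, the number of generations, and the 1-D toy environment are all assumptions introduced here for illustration. The deterministic `toy_reward` merely stands in for rollouts through the learned VAE and MDRNN world model, which in the paper is stochastic.

```python
import random


def rmhc_plan(reward_fn, horizon, actions, generations=200,
              mutation_rate=0.2, seed=0):
    """Random Mutation Hill Climbing over a fixed-length action sequence.

    reward_fn is a hypothetical stand-in for evaluating a plan in a
    learned world model; it maps a list of actions to a scalar score.
    """
    rng = random.Random(seed)
    # Start from a random action sequence.
    best = [rng.choice(actions) for _ in range(horizon)]
    best_score = reward_fn(best)
    for _ in range(generations):
        # Resample each action independently with probability mutation_rate.
        candidate = [
            rng.choice(actions) if rng.random() < mutation_rate else a
            for a in best
        ]
        score = reward_fn(candidate)
        # Keep the mutant only if it is at least as good as the incumbent.
        if score >= best_score:
            best, best_score = candidate, score
    return best, best_score


def toy_reward(plan, start=0, target=5):
    """Assumed toy model: 1-D position, penalised by distance to target."""
    pos = start
    total = 0.0
    for action in plan:
        pos += action
        total -= abs(target - pos)
    return total


plan, score = rmhc_plan(toy_reward, horizon=8, actions=[-1, 0, 1])
```

Because a mutant replaces the incumbent only when its score is not lower, the best score never decreases across generations. With a stochastic model, one would typically average `reward_fn` over several rollouts per candidate; that averaging is omitted here.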

Submitted 23 November, 2020; originally announced November 2020.

Comments: Code to reproduce the experiments is available at https://github.com/two2tee/WorldModelPlanning and a video of driving performance is available at https://youtu.be/3M39QgeF27U
