Showing 1–3 of 3 results for author: Tolguenec, P L

  1. arXiv:2411.14085  [pdf, other]

    cs.LG

    Exploration by Running Away from the Past

    Authors: Paul-Antoine Le Tolguenec, Yann Besse, Florent Teichteil-Koenigsbuch, Dennis G. Wilson, Emmanuel Rachelson

    Abstract: The ability to explore efficiently and effectively is a central challenge of reinforcement learning. In this work, we consider exploration through the lens of information theory. Specifically, we cast exploration as a problem of maximizing the Shannon entropy of the state occupation measure. This is done by maximizing a sequence of divergences between distributions representing an agent's past beh…

    Submitted 21 November, 2024; originally announced November 2024.

  2. arXiv:2406.10127  [pdf, other]

    cs.AI cs.RO

    Exploration by Learning Diverse Skills through Successor State Measures

    Authors: Paul-Antoine Le Tolguenec, Yann Besse, Florent Teichteil-Konigsbuch, Dennis G. Wilson, Emmanuel Rachelson

    Abstract: The ability to perform different skills can encourage agents to explore. In this work, we aim to construct a set of diverse skills which uniformly cover the state space. We propose a formalization of this search for diverse skills, building on a previous definition based on the mutual information between states and skills. We consider the distribution of states reached by a policy conditioned on e…

    Submitted 14 June, 2024; originally announced June 2024.

  3. arXiv:2212.03530  [pdf, other]

    cs.NE cs.AI

    Curiosity creates Diversity in Policy Search

    Authors: Paul-Antoine Le Tolguenec, Emmanuel Rachelson, Yann Besse, Dennis G. Wilson

    Abstract: When searching for policies, reward-sparse environments often lack sufficient information about which behaviors to improve upon or avoid. In such environments, the policy search process is bound to blindly search for reward-yielding transitions and no early reward can bias this search in one direction or another. A way to overcome this is to use intrinsic motivation in order to explore new transit…

    Submitted 15 July, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

    Comments: Transactions on Evolutionary Learning and Optimization, 2023