Showing 1–3 of 3 results for author: Teichteil-Konigsbuch, F

  1. arXiv:2411.14085  [pdf, other]

    cs.LG

    Exploration by Running Away from the Past

    Authors: Paul-Antoine Le Tolguenec, Yann Besse, Florent Teichteil-Koenigsbuch, Dennis G. Wilson, Emmanuel Rachelson

    Abstract: The ability to explore efficiently and effectively is a central challenge of reinforcement learning. In this work, we consider exploration through the lens of information theory. Specifically, we cast exploration as a problem of maximizing the Shannon entropy of the state occupation measure. This is done by maximizing a sequence of divergences between distributions representing an agent's past beh…

    Submitted 21 November, 2024; originally announced November 2024.

  2. arXiv:2406.10127  [pdf, other]

    cs.AI cs.RO

    Exploration by Learning Diverse Skills through Successor State Measures

    Authors: Paul-Antoine Le Tolguenec, Yann Besse, Florent Teichteil-Konigsbuch, Dennis G. Wilson, Emmanuel Rachelson

    Abstract: The ability to perform different skills can encourage agents to explore. In this work, we aim to construct a set of diverse skills which uniformly cover the state space. We propose a formalization of this search for diverse skills, building on a previous definition based on the mutual information between states and skills. We consider the distribution of states reached by a policy conditioned on e…

    Submitted 14 June, 2024; originally announced June 2024.

  3. arXiv:1309.6826  [pdf]

    cs.AI

    Qualitative Possibilistic Mixed-Observable MDPs

    Authors: Nicolas Drougard, Florent Teichteil-Konigsbuch, Jean-Loup Farges, Didier Dubois

    Abstract: Possibilistic and qualitative POMDPs (pi-POMDPs) are counterparts of POMDPs used to model situations where the agent's initial belief or observation probabilities are imprecise due to lack of past experiences or insufficient data collection. However, like probabilistic POMDPs, optimally solving pi-POMDPs is intractable: the finite belief state space exponentially grows with the number of system's…

    Submitted 26 September, 2013; originally announced September 2013.

    Comments: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI 2013)

    Report number: UAI-P-2013-PG-192-201