-
From Goal-Conditioned to Language-Conditioned Agents via Vision-Language Models
Authors:
Theo Cachet,
Christopher R. Dance,
Olivier Sigaud
Abstract:
Vision-language models (VLMs) have tremendous potential for grounding language, and thus enabling language-conditioned agents (LCAs) to perform diverse tasks specified with text. This has motivated the study of LCAs based on reinforcement learning (RL) with rewards given by rendering images of an environment and evaluating those images with VLMs. If single-task RL is employed, such approaches are limited by the cost and time required to train a policy for each new task. Multi-task RL (MTRL) is a natural alternative, but requires a carefully designed corpus of training tasks and does not always generalize reliably to new tasks. Therefore, this paper introduces a novel decomposition of the problem of building an LCA: first find an environment configuration that has a high VLM score for text describing a task; then use a (pretrained) goal-conditioned policy to reach that configuration. We also explore several enhancements to the speed and quality of VLM-based LCAs, notably, the use of distilled models, and the evaluation of configurations from multiple viewpoints to resolve the ambiguities inherent in a single 2D view. We demonstrate our approach on the Humanoid environment, showing that it results in LCAs that outperform MTRL baselines in zero-shot generalization, without requiring any textual task descriptions or other forms of environment-specific annotation during training.
Videos and an interactive demo can be found at https://europe.naverlabs.com/text2control
Submitted 26 November, 2024; v1 submitted 24 September, 2024;
originally announced September 2024.
-
Risk-Conditioned Distributional Soft Actor-Critic for Risk-Sensitive Navigation
Authors:
Jinyoung Choi,
Christopher R. Dance,
Jung-eun Kim,
Seulbin Hwang,
Kyung-sik Park
Abstract:
Modern navigation algorithms based on deep reinforcement learning (RL) show promising efficiency and robustness. However, most deep RL algorithms operate in a risk-neutral manner, making no special attempt to shield users from relatively rare but serious outcomes, even if such shielding might cause little loss of performance. Furthermore, such algorithms typically make no provisions to ensure safety in the presence of inaccuracies in the models on which they were trained, beyond adding a cost-of-collision and some domain randomization while training, in spite of the formidable complexity of the environments in which they operate. In this paper, we present a novel distributional RL algorithm that not only learns an uncertainty-aware policy, but can also change its risk measure without expensive fine-tuning or retraining. Our method shows superior performance and safety over baselines in partially-observed navigation tasks. We also demonstrate that agents trained using our method can adapt their policies to a wide range of risk measures at run-time.
Submitted 9 April, 2021; v1 submitted 7 April, 2021;
originally announced April 2021.
-
From handcrafted to deep local features
Authors:
Gabriela Csurka,
Christopher R. Dance,
Martin Humenberger
Abstract:
This paper presents an overview of the evolution of local features from handcrafted to deep-learning-based methods, followed by a discussion of several benchmarks and papers evaluating such local features. Our investigations are motivated by 3D reconstruction problems, where the precise location of the features is important. As we describe these methods, we highlight and explain the challenges of feature extraction and potential ways to overcome them. We first present handcrafted methods, followed by methods based on classical machine learning, and finally we discuss methods based on deep learning. This largely chronological presentation will help the reader to fully understand the topic of image and region description in order to make the best use of it in modern computer vision applications. In particular, understanding handcrafted methods and their motivation can help to understand modern approaches and how machine learning is used to improve the results. We also provide references to most of the relevant literature and code.
Submitted 14 June, 2019; v1 submitted 26 July, 2018;
originally announced July 2018.
-
On Inductive Abilities of Latent Factor Models for Relational Learning
Authors:
Théo Trouillon,
Éric Gaussier,
Christopher R. Dance,
Guillaume Bouchard
Abstract:
Latent factor models are increasingly popular for modeling multi-relational knowledge graphs. By their vectorial nature, it is not only hard to interpret why this class of models works so well, but also to understand where they fail and how they might be improved. We conduct an experimental survey of state-of-the-art models, not towards a purely comparative end, but as a means to gain insight into their inductive abilities. To assess the strengths and weaknesses of each model, we create simple tasks that exhibit, first, atomic properties of binary relations and, second, common inter-relational inference through synthetic genealogies. Based on these experimental results, we propose new research directions to improve on existing models.
Submitted 17 September, 2017;
originally announced September 2017.
-
Knowledge Graph Completion via Complex Tensor Factorization
Authors:
Théo Trouillon,
Christopher R. Dance,
Johannes Welbl,
Sebastian Riedel,
Éric Gaussier,
Guillaume Bouchard
Abstract:
In statistical relational learning, knowledge graph completion deals with automatically understanding the structure of large knowledge graphs (labeled directed graphs) and predicting missing relationships (labeled edges). State-of-the-art embedding models propose different trade-offs between modeling expressiveness and time and space complexity. We reconcile expressiveness and complexity through the use of complex-valued embeddings and explore the link between such complex-valued embeddings and unitary diagonalization. We corroborate our approach theoretically and show that every real square matrix, and thus every possible relation/adjacency matrix, is the real part of some unitarily diagonalizable matrix. This result opens the door to many other applications of square matrix factorization. Our approach based on complex embeddings is arguably simple, as it only involves a Hermitian dot product, the complex counterpart of the standard dot product between real vectors, whereas other methods resort to more and more complicated composition functions to increase their expressiveness. The proposed complex embeddings are scalable to large data sets, as they remain linear in both space and time, while consistently outperforming alternative approaches on standard link prediction benchmarks.
Submitted 26 November, 2017; v1 submitted 22 February, 2017;
originally announced February 2017.
-
Dynamic Mechanism Design with Interdependent Valuations
Authors:
Swaprava Nath,
Onno Zoeter,
Y. Narahari,
Christopher R. Dance
Abstract:
We consider an infinite-horizon dynamic mechanism design problem with interdependent valuations. In this setting, the type of each agent is assumed to evolve according to a first-order Markov process and is independent of the types of other agents. However, the valuation of an agent can depend on the types of other agents, which places the problem in an interdependent valuation setting. Designing truthful mechanisms in this setting is non-trivial in view of an impossibility result which says that, for interdependent valuations, any efficient and ex-post incentive compatible mechanism must be a constant mechanism, even in a static setting. Mezzetti (2004) circumvents this problem by splitting the decisions of allocation and payment into two stages. However, Mezzetti's result is limited to a static setting and, moreover, in the second stage of that mechanism, agents are weakly indifferent about reporting their valuations truthfully. This paper provides a first attempt at designing a dynamic mechanism which is efficient, 'strict' ex-post incentive compatible and ex-post individually rational in a setting with interdependent values and Markovian type evolution.
Submitted 25 June, 2015;
originally announced June 2015.
-
Dynamic Mechanism Design for Markets with Strategic Resources
Authors:
Swaprava Nath,
Onno Zoeter,
Yadati Narahari,
Christopher R. Dance
Abstract:
The assignment of tasks to multiple resources becomes an interesting game-theoretic problem when both the task owner and the resources are strategic. In the classical, nonstrategic setting, where the states of the tasks and resources are observable by the controller, this problem is that of finding an optimal policy for a Markov decision process (MDP). When the states are held by strategic agents, the problem of efficient task allocation extends beyond that of solving an MDP and becomes that of designing a mechanism. Motivated by this fact, we propose a general mechanism which decides on an allocation rule for the tasks and resources and a payment rule to incentivize agents' participation and truthful reports. In contrast to related dynamic strategic control problems studied in recent literature, the problem studied here has interdependent values: the benefit of an allocation to the task owner is not simply a function of the characteristics of the task itself and the allocation, but also of the state of the resources. We introduce a dynamic extension of Mezzetti's two-phase mechanism for interdependent valuations. In this changed setting, the proposed dynamic mechanism is efficient, within-period ex-post incentive compatible, and within-period ex-post individually rational.
Submitted 14 February, 2012;
originally announced February 2012.