-
FLoRA: Sample-Efficient Preference-based RL via Low-Rank Style Adaptation of Reward Functions
Authors:
Daniel Marta,
Simon Holk,
Miguel Vasco,
Jens Lundell,
Timon Homberger,
Finn Busch,
Olov Andersson,
Danica Kragic,
Iolanda Leite
Abstract:
Preference-based reinforcement learning (PbRL) is a suitable approach for style adaptation of pre-trained robotic behavior: adapting the robot's policy to follow human user preferences while still being able to perform the original task. However, collecting preferences for the adaptation process in robotics is often challenging and time-consuming. In this work we explore the adaptation of pre-trai…
▽ More
Preference-based reinforcement learning (PbRL) is a suitable approach for style adaptation of pre-trained robotic behavior: adapting the robot's policy to follow human user preferences while still being able to perform the original task. However, collecting preferences for the adaptation process in robotics is often challenging and time-consuming. In this work we explore the adaptation of pre-trained robots in the low-preference-data regime. We show that, in this regime, recent adaptation approaches suffer from catastrophic reward forgetting (CRF), where the updated reward model overfits to the new preferences, leading the agent to become unable to perform the original task. To mitigate CRF, we propose to enhance the original reward model with a small number of parameters (low-rank matrices) responsible for modeling the preference adaptation. Our evaluation shows that our method can efficiently and effectively adjust robotic behavior to human preferences across simulation benchmark tasks and multiple real-world robotic tasks.
△ Less
Submitted 14 April, 2025;
originally announced April 2025.
-
FLAME: A Federated Learning Benchmark for Robotic Manipulation
Authors:
Santiago Bou Betran,
Alberta Longhini,
Miguel Vasco,
Yuchong Zhang,
Danica Kragic
Abstract:
Recent progress in robotic manipulation has been fueled by large-scale datasets collected across diverse environments. Training robotic manipulation policies on these datasets is traditionally performed in a centralized manner, raising concerns regarding scalability, adaptability, and data privacy. While federated learning enables decentralized, privacy-preserving training, its application to robo…
▽ More
Recent progress in robotic manipulation has been fueled by large-scale datasets collected across diverse environments. Training robotic manipulation policies on these datasets is traditionally performed in a centralized manner, raising concerns regarding scalability, adaptability, and data privacy. While federated learning enables decentralized, privacy-preserving training, its application to robotic manipulation remains largely unexplored. We introduce FLAME (Federated Learning Across Manipulation Environments), the first benchmark designed for federated learning in robotic manipulation. FLAME consists of: (i) a set of large-scale datasets of over 160,000 expert demonstrations of multiple manipulation tasks, collected across a wide range of simulated environments; (ii) a training and evaluation framework for robotic policy learning in a federated setting. We evaluate standard federated learning algorithms in FLAME, showing their potential for distributed policy learning and highlighting key challenges. Our benchmark establishes a foundation for scalable, adaptive, and privacy-aware robotic learning.
△ Less
Submitted 3 March, 2025;
originally announced March 2025.
-
Humans Coexist, So Must Embodied Artificial Agents
Authors:
Hannah Kuehn,
Joseph La Delfa,
Miguel Vasco,
Danica Kragic,
Iolanda Leite
Abstract:
This paper introduces the concept of coexistence for embodied artificial agents and argues that it is a prerequisite for long-term, in-the-wild interaction with humans. Contemporary embodied artificial agents excel in static, predefined tasks but fall short in dynamic and long-term interactions with humans. On the other hand, humans can adapt and evolve continuously, exploiting the situated knowle…
▽ More
This paper introduces the concept of coexistence for embodied artificial agents and argues that it is a prerequisite for long-term, in-the-wild interaction with humans. Contemporary embodied artificial agents excel in static, predefined tasks but fall short in dynamic and long-term interactions with humans. On the other hand, humans can adapt and evolve continuously, exploiting the situated knowledge embedded in their environment and other agents, thus contributing to meaningful interactions. We take an interdisciplinary approach at different levels of organization, drawing from biology and design theory, to understand how human and non-human organisms foster entities that coexist within their specific environments. Finally, we propose key research directions for the artificial intelligence community to develop coexisting embodied agents, focusing on the principles, hardware and learning methods responsible for shaping them.
△ Less
Submitted 2 June, 2025; v1 submitted 7 February, 2025;
originally announced February 2025.
-
Human-Aligned Image Models Improve Visual Decoding from the Brain
Authors:
Nona Rajabi,
Antônio H. Ribeiro,
Miguel Vasco,
Farzaneh Taleb,
Mårten Björkman,
Danica Kragic
Abstract:
Decoding visual images from brain activity has significant potential for advancing brain-computer interaction and enhancing the understanding of human perception. Recent approaches align the representation spaces of images and brain activity to enable visual decoding. In this paper, we introduce the use of human-aligned image encoders to map brain signals to images. We hypothesize that these model…
▽ More
Decoding visual images from brain activity has significant potential for advancing brain-computer interaction and enhancing the understanding of human perception. Recent approaches align the representation spaces of images and brain activity to enable visual decoding. In this paper, we introduce the use of human-aligned image encoders to map brain signals to images. We hypothesize that these models more effectively capture perceptual attributes associated with the rapid visual stimuli presentations commonly used in visual brain data recording experiments. Our empirical results support this hypothesis, demonstrating that this simple modification improves image retrieval accuracy by up to 21% compared to state-of-the-art methods. Comprehensive experiments confirm consistent performance improvements across diverse EEG architectures, image encoders, alignment methods, participants, and brain imaging modalities
△ Less
Submitted 10 June, 2025; v1 submitted 5 February, 2025;
originally announced February 2025.
-
Can Transformers Smell Like Humans?
Authors:
Farzaneh Taleb,
Miguel Vasco,
Antônio H. Ribeiro,
Mårten Björkman,
Danica Kragic
Abstract:
The human brain encodes stimuli from the environment into representations that form a sensory perception of the world. Despite recent advances in understanding visual and auditory perception, olfactory perception remains an under-explored topic in the machine learning community due to the lack of large-scale datasets annotated with labels of human olfactory perception. In this work, we ask the que…
▽ More
The human brain encodes stimuli from the environment into representations that form a sensory perception of the world. Despite recent advances in understanding visual and auditory perception, olfactory perception remains an under-explored topic in the machine learning community due to the lack of large-scale datasets annotated with labels of human olfactory perception. In this work, we ask the question of whether pre-trained transformer models of chemical structures encode representations that are aligned with human olfactory perception, i.e., can transformers smell like humans? We demonstrate that representations encoded from transformers pre-trained on general chemical structures are highly aligned with human olfactory perception. We use multiple datasets and different types of perceptual representations to show that the representations encoded by transformer models are able to predict: (i) labels associated with odorants provided by experts; (ii) continuous ratings provided by human participants with respect to pre-defined descriptors; and (iii) similarity ratings between odorants provided by human participants. Finally, we evaluate the extent to which this alignment is associated with physicochemical features of odorants known to be relevant for olfactory decoding.
△ Less
Submitted 5 November, 2024;
originally announced November 2024.
-
Reducing Variance in Meta-Learning via Laplace Approximation for Regression Tasks
Authors:
Alfredo Reichlin,
Gustaf Tegnér,
Miguel Vasco,
Hang Yin,
Mårten Björkman,
Danica Kragic
Abstract:
Given a finite set of sample points, meta-learning algorithms aim to learn an optimal adaptation strategy for new, unseen tasks. Often, this data can be ambiguous as it might belong to different tasks concurrently. This is particularly the case in meta-regression tasks. In such cases, the estimated adaptation strategy is subject to high variance due to the limited amount of support data for each t…
▽ More
Given a finite set of sample points, meta-learning algorithms aim to learn an optimal adaptation strategy for new, unseen tasks. Often, this data can be ambiguous as it might belong to different tasks concurrently. This is particularly the case in meta-regression tasks. In such cases, the estimated adaptation strategy is subject to high variance due to the limited amount of support data for each task, which often leads to sub-optimal generalization performance. In this work, we address the problem of variance reduction in gradient-based meta-learning and formalize the class of problems prone to this, a condition we refer to as \emph{task overlap}. Specifically, we propose a novel approach that reduces the variance of the gradient estimate by weighing each support point individually by the variance of its posterior over the parameters. To estimate the posterior, we utilize the Laplace approximation, which allows us to express the variance in terms of the curvature of the loss landscape of our meta-learner. Experimental results demonstrate the effectiveness of the proposed method and highlight the importance of variance reduction in meta-learning.
△ Less
Submitted 23 October, 2024; v1 submitted 2 October, 2024;
originally announced October 2024.
-
A Super-human Vision-based Reinforcement Learning Agent for Autonomous Racing in Gran Turismo
Authors:
Miguel Vasco,
Takuma Seno,
Kenta Kawamoto,
Kaushik Subramanian,
Peter R. Wurman,
Peter Stone
Abstract:
Racing autonomous cars faster than the best human drivers has been a longstanding grand challenge for the fields of Artificial Intelligence and robotics. Recently, an end-to-end deep reinforcement learning agent met this challenge in a high-fidelity racing simulator, Gran Turismo. However, this agent relied on global features that require instrumentation external to the car. This paper introduces,…
▽ More
Racing autonomous cars faster than the best human drivers has been a longstanding grand challenge for the fields of Artificial Intelligence and robotics. Recently, an end-to-end deep reinforcement learning agent met this challenge in a high-fidelity racing simulator, Gran Turismo. However, this agent relied on global features that require instrumentation external to the car. This paper introduces, to the best of our knowledge, the first super-human car racing agent whose sensor input is purely local to the car, namely pixels from an ego-centric camera view and quantities that can be sensed from on-board the car, such as the car's velocity. By leveraging global features only at training time, the learned agent is able to outperform the best human drivers in time trial (one car on the track at a time) races using only local input features. The resulting agent is evaluated in Gran Turismo 7 on multiple tracks and cars. Detailed ablation experiments demonstrate the agent's strong reliance on visual inputs, making it the first vision-based super-human car racing agent.
△ Less
Submitted 18 June, 2024;
originally announced June 2024.
-
Will You Participate? Exploring the Potential of Robotics Competitions on Human-centric Topics
Authors:
Yuchong Zhang,
Miguel Vasco,
Mårten Björkman,
Danica Kragic
Abstract:
This paper presents findings from an exploratory needfinding study investigating the research current status and potential participation of the competitions on the robotics community towards four human-centric topics: safety, privacy, explainability, and federated learning. We conducted a survey with 34 participants across three distinguished European robotics consortia, nearly 60% of whom possess…
▽ More
This paper presents findings from an exploratory needfinding study investigating the research current status and potential participation of the competitions on the robotics community towards four human-centric topics: safety, privacy, explainability, and federated learning. We conducted a survey with 34 participants across three distinguished European robotics consortia, nearly 60% of whom possessed over five years of research experience in robotics. Our qualitative and quantitative analysis revealed that current mainstream robotic researchers prioritize safety and explainability, expressing a greater willingness to invest in further research in these areas. Conversely, our results indicate that privacy and federated learning garner less attention and are perceived to have lower potential. Additionally, the study suggests a lack of enthusiasm within the robotics community for participating in competitions related to these topics. Based on these findings, we recommend targeting other communities, such as the machine learning community, for future competitions related to these four human-centric topics.
△ Less
Submitted 27 March, 2024;
originally announced March 2024.
-
NeuralSolver: Learning Algorithms For Consistent and Efficient Extrapolation Across General Tasks
Authors:
Bernardo Esteves,
Miguel Vasco,
Francisco S. Melo
Abstract:
We contribute NeuralSolver, a novel recurrent solver that can efficiently and consistently extrapolate, i.e., learn algorithms from smaller problems (in terms of observation size) and execute those algorithms in large problems. Contrary to previous recurrent solvers, NeuralSolver can be naturally applied in both same-size problems, where the input and output sizes are the same, and in different-si…
▽ More
We contribute NeuralSolver, a novel recurrent solver that can efficiently and consistently extrapolate, i.e., learn algorithms from smaller problems (in terms of observation size) and execute those algorithms in large problems. Contrary to previous recurrent solvers, NeuralSolver can be naturally applied in both same-size problems, where the input and output sizes are the same, and in different-size problems, where the size of the input and output differ. To allow for this versatility, we design NeuralSolver with three main components: a recurrent module, that iteratively processes input information at different scales, a processing module, responsible for aggregating the previously processed information, and a curriculum-based training scheme, that improves the extrapolation performance of the method. To evaluate our method we introduce a set of novel different-size tasks and we show that NeuralSolver consistently outperforms the prior state-of-the-art recurrent solvers in extrapolating to larger problems, considering smaller training problems and requiring less parameters than other approaches.
△ Less
Submitted 31 October, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
Learning Goal-Conditioned Policies from Sub-Optimal Offline Data via Metric Learning
Authors:
Alfredo Reichlin,
Miguel Vasco,
Hang Yin,
Danica Kragic
Abstract:
We address the problem of learning optimal behavior from sub-optimal datasets for goal-conditioned offline reinforcement learning. To do so, we propose the use of metric learning to approximate the optimal value function for goal-conditioned offline RL problems under sparse rewards, invertible actions and deterministic transitions. We introduce distance monotonicity, a property for representations…
▽ More
We address the problem of learning optimal behavior from sub-optimal datasets for goal-conditioned offline reinforcement learning. To do so, we propose the use of metric learning to approximate the optimal value function for goal-conditioned offline RL problems under sparse rewards, invertible actions and deterministic transitions. We introduce distance monotonicity, a property for representations to recover optimality and propose an optimization objective that leads to such property. We use the proposed value function to guide the learning of a policy in an actor-critic fashion, a method we name MetricRL. Experimentally, we show that our method estimates optimal behaviors from severely sub-optimal offline datasets without suffering from out-of-distribution estimation errors. We demonstrate that MetricRL consistently outperforms prior state-of-the-art goal-conditioned RL methods in learning optimal policies from sub-optimal offline datasets.
△ Less
Submitted 8 June, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
Applications of Finite non-Abelian Simple Groups to Cryptography in the Quantum Era
Authors:
María Isabel González Vasco,
Delaram Kahrobaei,
Eilidh McKemmie
Abstract:
The theory of finite simple groups is a (rather unexplored) area likely to provide interesting computational problems and modelling tools useful in a cryptographic context. In this note, we review some applications of finite non-abelian simple groups to cryptography and discuss different scenarios in which this theory is clearly central, providing the relevant definitions to make the material acce…
▽ More
The theory of finite simple groups is a (rather unexplored) area likely to provide interesting computational problems and modelling tools useful in a cryptographic context. In this note, we review some applications of finite non-abelian simple groups to cryptography and discuss different scenarios in which this theory is clearly central, providing the relevant definitions to make the material accessible to both cryptographers and group theorists, in the hope of stimulating further interaction between these two (non-disjoint) communities. In particular, we look at constructions based on various group-theoretic factorization problems, review group theoretical hash functions, and discuss fully homomorphic encryption using simple groups. The Hidden Subgroup Problem is also briefly discussed in this context.
△ Less
Submitted 28 August, 2023;
originally announced August 2023.
-
Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning
Authors:
Pedro P. Santos,
Diogo S. Carvalho,
Miguel Vasco,
Alberto Sardinha,
Pedro A. Santos,
Ana Paiva,
Francisco S. Melo
Abstract:
We introduce hybrid execution in multi-agent reinforcement learning (MARL), a new paradigm in which agents aim to successfully complete cooperative tasks with arbitrary communication levels at execution time by taking advantage of information-sharing among the agents. Under hybrid execution, the communication level can range from a setting in which no communication is allowed between agents (fully…
▽ More
We introduce hybrid execution in multi-agent reinforcement learning (MARL), a new paradigm in which agents aim to successfully complete cooperative tasks with arbitrary communication levels at execution time by taking advantage of information-sharing among the agents. Under hybrid execution, the communication level can range from a setting in which no communication is allowed between agents (fully decentralized), to a setting featuring full communication (fully centralized), but the agents do not know beforehand which communication level they will encounter at execution time. To formalize our setting, we define a new class of multi-agent partially observable Markov decision processes (POMDPs) that we name hybrid-POMDPs, which explicitly model a communication process between the agents. We contribute MARO, an approach that makes use of an auto-regressive predictive model, trained in a centralized manner, to estimate missing agents' observations at execution time. We evaluate MARO on standard scenarios and extensions of previous benchmarks tailored to emphasize the negative impact of partial observability in MARL. Experimental results show that our method consistently outperforms relevant baselines, allowing agents to act with faulty communication while successfully exploiting shared information.
△ Less
Submitted 5 June, 2023; v1 submitted 12 October, 2022;
originally announced October 2022.
-
Perceive, Represent, Generate: Translating Multimodal Information to Robotic Motion Trajectories
Authors:
Fábio Vital,
Miguel Vasco,
Alberto Sardinha,
Francisco Melo
Abstract:
We present Perceive-Represent-Generate (PRG), a novel three-stage framework that maps perceptual information of different modalities (e.g., visual or sound), corresponding to a sequence of instructions, to an adequate sequence of movements to be executed by a robot. In the first stage, we perceive and pre-process the given inputs, isolating individual commands from the complete instruction provide…
▽ More
We present Perceive-Represent-Generate (PRG), a novel three-stage framework that maps perceptual information of different modalities (e.g., visual or sound), corresponding to a sequence of instructions, to an adequate sequence of movements to be executed by a robot. In the first stage, we perceive and pre-process the given inputs, isolating individual commands from the complete instruction provided by a human user. In the second stage we encode the individual commands into a multimodal latent space, employing a deep generative model. Finally, in the third stage we convert the multimodal latent values into individual trajectories and combine them into a single dynamic movement primitive, allowing its execution in a robotic platform. We evaluate our pipeline in the context of a novel robotic handwriting task, where the robot receives as input a word through different perceptual modalities (e.g., image, sound), and generates the corresponding motion trajectory to write it, creating coherent and readable handwritten words.
△ Less
Submitted 23 October, 2022; v1 submitted 6 April, 2022;
originally announced April 2022.
-
Geometric Multimodal Contrastive Representation Learning
Authors:
Petra Poklukar,
Miguel Vasco,
Hang Yin,
Francisco S. Melo,
Ana Paiva,
Danica Kragic
Abstract:
Learning representations of multimodal data that are both informative and robust to missing modalities at test time remains a challenging problem due to the inherent heterogeneity of data obtained from different channels. To address it, we present a novel Geometric Multimodal Contrastive (GMC) representation learning method consisting of two main components: i) a two-level architecture consisting…
▽ More
Learning representations of multimodal data that are both informative and robust to missing modalities at test time remains a challenging problem due to the inherent heterogeneity of data obtained from different channels. To address it, we present a novel Geometric Multimodal Contrastive (GMC) representation learning method consisting of two main components: i) a two-level architecture consisting of modality-specific base encoders, allowing to process an arbitrary number of modalities to an intermediate representation of fixed dimensionality, and a shared projection head, mapping the intermediate representations to a latent representation space; ii) a multimodal contrastive loss function that encourages the geometric alignment of the learned representations. We experimentally demonstrate that GMC representations are semantically rich and achieve state-of-the-art performance with missing modality information on three different learning problems including prediction and reinforcement learning tasks.
△ Less
Submitted 18 November, 2022; v1 submitted 7 February, 2022;
originally announced February 2022.
-
Chirotonia: A Scalable and Secure e-Voting Framework based on Blockchains and Linkable Ring Signatures
Authors:
Antonio Russo,
Antonio Fernández Anta,
Maria Isabel González Vasco,
Simon Pietro Romano
Abstract:
In this paper we propose a comprehensive and scalable framework to build secure-by-design e-voting systems. Decentralization, transparency, determinism, and untamperability of votes are granted by dedicated smart contracts on a blockchain, while voter authenticity and anonymity are achieved through (provable secure) linkable ring signatures. These, in combination with suitable smart contract const…
▽ More
In this paper we propose a comprehensive and scalable framework to build secure-by-design e-voting systems. Decentralization, transparency, determinism, and untamperability of votes are granted by dedicated smart contracts on a blockchain, while voter authenticity and anonymity are achieved through (provable secure) linkable ring signatures. These, in combination with suitable smart contract constraints, also grant protection from double voting. Our design is presented in detail, focusing on its security guarantees and the design choices that allow it to scale to a large number of voters. Finally, we present a proof-of-concept implementation of the proposed framework, made available as open source.
△ Less
Submitted 3 November, 2021;
originally announced November 2021.
-
How to Sense the World: Leveraging Hierarchy in Multimodal Perception for Robust Reinforcement Learning Agents
Authors:
Miguel Vasco,
Hang Yin,
Francisco S. Melo,
Ana Paiva
Abstract:
This work addresses the problem of sensing the world: how to learn a multimodal representation of a reinforcement learning agent's environment that allows the execution of tasks under incomplete perceptual conditions. To address such problem, we argue for hierarchy in the design of representation models and contribute with a novel multimodal representation model, MUSE. The proposed model learns hi…
▽ More
This work addresses the problem of sensing the world: how to learn a multimodal representation of a reinforcement learning agent's environment that allows the execution of tasks under incomplete perceptual conditions. To address such problem, we argue for hierarchy in the design of representation models and contribute with a novel multimodal representation model, MUSE. The proposed model learns hierarchical representations: low-level modality-specific representations, encoded from raw observation data, and a high-level multimodal representation, encoding joint-modality information to allow robust state estimation. We employ MUSE as the sensory representation model of deep reinforcement learning agents provided with multimodal observations in Atari games. We perform a comparative study over different designs of reinforcement learning agents, showing that MUSE allows agents to perform tasks under incomplete perceptual experience with minimal performance loss. Finally, we evaluate the performance of MUSE in literature-standard multimodal scenarios with higher number and more complex modalities, showing that it outperforms state-of-the-art multimodal variational autoencoders in single and cross-modality generation.
△ Less
Submitted 31 January, 2022; v1 submitted 7 October, 2021;
originally announced October 2021.
-
MHVAE: a Human-Inspired Deep Hierarchical Generative Model for Multimodal Representation Learning
Authors:
Miguel Vasco,
Francisco S. Melo,
Ana Paiva
Abstract:
Humans are able to create rich representations of their external reality. Their internal representations allow for cross-modality inference, where available perceptions can induce the perceptual experience of missing input modalities. In this paper, we contribute the Multimodal Hierarchical Variational Auto-encoder (MHVAE), a hierarchical multimodal generative model for representation learning. In…
▽ More
Humans are able to create rich representations of their external reality. Their internal representations allow for cross-modality inference, where available perceptions can induce the perceptual experience of missing input modalities. In this paper, we contribute the Multimodal Hierarchical Variational Auto-encoder (MHVAE), a hierarchical multimodal generative model for representation learning. Inspired by human cognitive models, the MHVAE is able to learn modality-specific distributions, of an arbitrary number of modalities, and a joint-modality distribution, responsible for cross-modality inference. We formally derive the model's evidence lower bound and propose a novel methodology to approximate the joint-modality posterior based on modality-specific representation dropout. We evaluate the MHVAE on standard multimodal datasets. Our model performs on par with other state-of-the-art generative models regarding joint-modality reconstruction from arbitrary input modalities and cross-modality inference.
△ Less
Submitted 4 June, 2020;
originally announced June 2020.
-
Concerning Quantum Identification Without Entanglement
Authors:
Carlos E. González-Guillén,
María Isabel González Vasco,
Floyd Johnson,
Ángel L. Pérez del Pozo
Abstract:
Identification schemes are interactive protocols typically involving two parties, a prover, who wants to provide evidence of his or her identity and a verifier, who checks the provided evidence and decide whether it comes or not from the intended prover. In this paper, we comment on a recent proposal for quantum identity authentication from Zawadzki, and give a concrete attack upholding theoretica…
▽ More
Identification schemes are interactive protocols typically involving two parties, a prover, who wants to provide evidence of his or her identity and a verifier, who checks the provided evidence and decide whether it comes or not from the intended prover. In this paper, we comment on a recent proposal for quantum identity authentication from Zawadzki, and give a concrete attack upholding theoretical impossibility results from Lo and Buhrman et al. More precisely, we show that using a simple strategyan adversary may indeed obtain non-negligible information on the shared identification secret. While the security of a quantum identity authentication scheme is not formally defined in [1], it is clear that such a definition should somehow imply that an external entity may gain no information on the shared identification scheme (even if he actively participates injecting messages in a protocol execution, which is not assumed in our attack strategy).
△ Less
Submitted 30 March, 2020; v1 submitted 26 March, 2020;
originally announced March 2020.
-
Playing Games in the Dark: An approach for cross-modality transfer in reinforcement learning
Authors:
Rui Silva,
Miguel Vasco,
Francisco S. Melo,
Ana Paiva,
Manuela Veloso
Abstract:
In this work we explore the use of latent representations obtained from multiple input sensory modalities (such as images or sounds) in allowing an agent to learn and exploit policies over different subsets of input modalities. We propose a three-stage architecture that allows a reinforcement learning agent trained over a given sensory modality, to execute its task on a different sensory modality-…
▽ More
In this work we explore the use of latent representations obtained from multiple input sensory modalities (such as images or sounds) in allowing an agent to learn and exploit policies over different subsets of input modalities. We propose a three-stage architecture that allows a reinforcement learning agent trained over a given sensory modality, to execute its task on a different sensory modality-for example, learning a visual policy over image inputs, and then execute such policy when only sound inputs are available. We show that the generalized policies achieve better out-of-the-box performance when compared to different baselines. Moreover, we show this holds in different OpenAI gym and video game environments, even when using different multimodal generative models and reinforcement learning algorithms.
△ Less
Submitted 28 November, 2019;
originally announced November 2019.
-
Learning multimodal representations for sample-efficient recognition of human actions
Authors:
Miguel Vasco,
Francisco S. Melo,
David Martins de Matos,
Ana Paiva,
Tetsunari Inamura
Abstract:
Humans interact in rich and diverse ways with the environment. However, the representation of such behavior by artificial agents is often limited. In this work we present \textit{motion concepts}, a novel multimodal representation of human actions in a household environment. A motion concept encompasses a probabilistic description of the kinematics of the action along with its contextual backgroun…
▽ More
Humans interact in rich and diverse ways with the environment. However, the representation of such behavior by artificial agents is often limited. In this work we present \textit{motion concepts}, a novel multimodal representation of human actions in a household environment. A motion concept encompasses a probabilistic description of the kinematics of the action along with its contextual background, namely the location and the objects held during the performance. Furthermore, we present Online Motion Concept Learning (OMCL), a new algorithm which learns novel motion concepts from action demonstrations and recognizes previously learned motion concepts. The algorithm is evaluated on a virtual-reality household environment with the presence of a human avatar. OMCL outperforms standard motion recognition algorithms on an one-shot recognition task, attesting to its potential for sample-efficient recognition of human actions.
△ Less
Submitted 6 March, 2019;
originally announced March 2019.