-
Adaptive Domain Modeling with Language Models: A Multi-Agent Approach to Task Planning
Authors:
Harisankar Babu,
Philipp Schillinger,
Tamim Asfour
Abstract:
We introduce TAPAS (Task-based Adaptation and Planning using AgentS), a multi-agent framework that integrates Large Language Models (LLMs) with symbolic planning to solve complex tasks without the need for manually defined environment models. TAPAS employs specialized LLM-based agents that collaboratively generate and adapt domain models, initial states, and goal specifications as needed using structured tool-calling mechanisms. Through this tool-based interaction, downstream agents can request modifications from upstream agents, enabling adaptation to novel attributes and constraints without manual domain redefinition. A ReAct (Reason+Act)-style execution agent, coupled with natural language plan translation, bridges the gap between dynamically generated plans and real-world robot capabilities. TAPAS demonstrates strong performance in benchmark planning domains and in the VirtualHome simulated real-world environment.
Submitted 30 June, 2025; v1 submitted 24 June, 2025;
originally announced June 2025.
-
Geometric Contact Flows: Contactomorphisms for Dynamics and Control
Authors:
Andrea Testa,
Søren Hauberg,
Tamim Asfour,
Leonel Rozo
Abstract:
Accurately modeling and predicting complex dynamical systems, particularly those involving force exchange and dissipation, is crucial for applications ranging from fluid dynamics to robotics, but presents significant challenges due to the intricate interplay of geometric constraints and energy transfer. This paper introduces Geometric Contact Flows (GCF), a novel framework leveraging Riemannian and Contact geometry as inductive biases to learn such systems. GCF constructs a latent contact Hamiltonian model encoding desirable properties like stability or energy conservation. An ensemble of contactomorphisms then adapts this model to the target dynamics while preserving these properties. This ensemble allows for uncertainty-aware geodesics that attract the system's behavior toward the data support, enabling robust generalization and adaptation to unseen scenarios. Experiments on learning dynamics for physical systems and for controlling robots on interaction tasks demonstrate the effectiveness of our approach.
Submitted 21 June, 2025;
originally announced June 2025.
-
Safe Reinforcement Learning of Robot Trajectories in the Presence of Moving Obstacles
Authors:
Jonas Kiemel,
Ludovic Righetti,
Torsten Kröger,
Tamim Asfour
Abstract:
In this paper, we present an approach for learning collision-free robot trajectories in the presence of moving obstacles. As a first step, we train a backup policy to generate evasive movements from arbitrary initial robot states using model-free reinforcement learning. When learning policies for other tasks, the backup policy can be used to estimate the potential risk of a collision and to offer an alternative action if the estimated risk is considered too high. No matter which action is selected, our action space ensures that the kinematic limits of the robot joints are not violated. We analyze and evaluate two different methods for estimating the risk of a collision. A physics simulation performed in the background is computationally expensive but provides the best results in deterministic environments. If a data-based risk estimator is used instead, the computational effort is significantly reduced, but an additional source of error is introduced. For evaluation, we successfully learn a reaching task and a basketball task while keeping the risk of collisions low. The results demonstrate the effectiveness of our approach for deterministic and stochastic environments, including a human-robot scenario and a ball environment, where no state can be considered permanently safe. By conducting experiments with a real robot, we show that our approach can generate safe trajectories in real time.
Submitted 8 November, 2024;
originally announced November 2024.
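The risk-gated action selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_safe_action`, the callable signatures, and the default `risk_threshold` are all hypothetical, and the risk estimator is left abstract (in the paper it is either a background physics simulation or a data-based estimator).

```python
from typing import Callable, Sequence, Tuple

State = Sequence[float]
Action = Sequence[float]


def select_safe_action(
    state: State,
    task_policy: Callable[[State], Action],
    backup_policy: Callable[[State], Action],
    estimate_risk: Callable[[State, Action], float],
    risk_threshold: float = 0.1,  # hypothetical value, not taken from the paper
) -> Tuple[Action, bool]:
    """Return the action to execute and whether the backup policy was used."""
    proposed = task_policy(state)
    # Estimated collision risk of executing the proposed task action.
    risk = estimate_risk(state, proposed)
    if risk > risk_threshold:
        # Risk is considered too high: fall back to the evasive backup action.
        return backup_policy(state), True
    return proposed, False
```

The boolean flag is a design convenience in this sketch so that a training loop could log how often the backup policy takes over.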
-
On Probabilistic Pullback Metrics for Latent Hyperbolic Manifolds
Authors:
Luis Augenstein,
Noémie Jaquier,
Tamim Asfour,
Leonel Rozo
Abstract:
Probabilistic Latent Variable Models (LVMs) excel at modeling complex, high-dimensional data through lower-dimensional representations. Recent advances show that equipping these latent representations with a Riemannian metric unlocks geometry-aware distances and shortest paths that comply with the underlying data structure. This paper focuses on hyperbolic embeddings, a particularly suitable choice for modeling hierarchical relationships. Previous approaches relying on hyperbolic geodesics for interpolating the latent space often generate paths crossing low-data regions, leading to highly uncertain predictions. Instead, we propose augmenting the hyperbolic manifold with a pullback metric to account for distortions introduced by the LVM's nonlinear mapping and provide a complete development for pullback metrics of Gaussian Process LVMs (GPLVMs). Our experiments demonstrate that geodesics on the pullback metric not only respect the geometry of the hyperbolic latent space but also align with the underlying data distribution, significantly reducing uncertainty in predictions.
Submitted 18 May, 2025; v1 submitted 28 October, 2024;
originally announced October 2024.
-
A Riemannian Framework for Learning Reduced-order Lagrangian Dynamics
Authors:
Katharina Friedl,
Noémie Jaquier,
Jens Lundell,
Tamim Asfour,
Danica Kragic
Abstract:
By incorporating physical consistency as inductive bias, deep neural networks display increased generalization capabilities and data efficiency in learning nonlinear dynamic models. However, the complexity of these models generally increases with the system dimensionality, requiring larger datasets, more complex deep networks, and significant computational effort. We propose a novel geometric network architecture to learn physically-consistent reduced-order dynamic parameters that accurately describe the original high-dimensional system behavior. This is achieved by building on recent advances in model-order reduction and by adopting a Riemannian perspective to jointly learn a non-linear structure-preserving latent space and the associated low-dimensional dynamics. Our approach enables accurate long-term predictions of the high-dimensional dynamics of rigid and deformable systems with increased data efficiency by inferring interpretable and physically-plausible reduced Lagrangian models.
Submitted 28 February, 2025; v1 submitted 24 October, 2024;
originally announced October 2024.
-
Learning Spatial Bimanual Action Models Based on Affordance Regions and Human Demonstrations
Authors:
Björn S. Plonka,
Christian Dreher,
Andre Meixner,
Rainer Kartmann,
Tamim Asfour
Abstract:
In this paper, we present a novel approach for learning bimanual manipulation actions from human demonstration by extracting spatial constraints between affordance regions, termed affordance constraints, of the objects involved. Affordance regions are defined as object parts that provide interaction possibilities to an agent. For example, the bottom of a bottle affords the object to be placed on a surface, while its spout affords the contained liquid to be poured. We propose a novel approach to learn changes of affordance constraints in human demonstration to construct spatial bimanual action models representing object interactions. To exploit the information encoded in these spatial bimanual action models, we formulate an optimization problem to determine optimal object configurations across multiple execution keypoints while taking into account the initial scene, the learned affordance constraints, and the robot's kinematics. We evaluate the approach in simulation with two example tasks (pouring drinks and rolling dough) and compare three different definitions of affordance constraints: (i) component-wise distances between affordance regions in Cartesian space, (ii) component-wise distances between affordance regions in cylindrical space, and (iii) degrees of satisfaction of manually defined symbolic spatial affordance constraints.
Submitted 18 November, 2024; v1 submitted 11 October, 2024;
originally announced October 2024.
-
Episodic Memory Verbalization using Hierarchical Representations of Life-Long Robot Experience
Authors:
Leonard Bärmann,
Chad DeChant,
Joana Plewnia,
Fabian Peller-Konrad,
Daniel Bauer,
Tamim Asfour,
Alex Waibel
Abstract:
Verbalization of robot experience, i.e., summarization of and question answering about a robot's past, is a crucial ability for improving human-robot interaction. Previous works applied rule-based systems or fine-tuned deep models to verbalize short (several-minute-long) streams of episodic data, limiting generalization and transferability. In our work, we apply large pretrained models to tackle this task with zero or few examples, and specifically focus on verbalizing life-long experiences. For this, we derive a tree-like data structure from episodic memory (EM), with lower levels representing raw perception and proprioception data, and higher levels abstracting events to natural language concepts. Given such a hierarchical representation built from the experience stream, we apply a large language model as an agent to interactively search the EM given a user's query, dynamically expanding (initially collapsed) tree nodes to find the relevant information. The approach keeps computational costs low even when scaling to months of robot experience data. We evaluate our method on simulated household robot data, human egocentric videos, and real-world robot recordings, demonstrating its flexibility and scalability.
Submitted 26 September, 2024;
originally announced September 2024.
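The idea of searching a hierarchical episodic memory by expanding only promising nodes can be sketched as follows. This is an illustrative toy, not the paper's system: the `Node` class, the `search` function, and the `is_relevant` callback are hypothetical names, and a plain predicate stands in for the large language model that decides which collapsed nodes to expand.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Node:
    # Natural-language summary at higher levels; raw event description at leaves.
    summary: str
    children: List["Node"] = field(default_factory=list)


def search(root: Node, is_relevant: Callable[[str], bool]) -> Tuple[List[str], int]:
    """Depth-first search that expands a node only if its summary looks relevant.

    Returns the summaries of relevant leaves and the number of expanded nodes.
    """
    hits: List[str] = []
    expanded = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if not is_relevant(node.summary):
            continue  # keep this subtree collapsed
        if not node.children:
            hits.append(node.summary)
        else:
            expanded += 1
            # Reverse so that children are visited in their original order.
            stack.extend(reversed(node.children))
    return hits, expanded
```

Because irrelevant subtrees are never opened, the number of expansions depends on the query rather than on the total size of the memory, which is the property the abstract attributes to the hierarchical representation.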
-
Force Myography based Torque Estimation in Human Knee and Ankle Joints
Authors:
Charlotte Marquardt,
Arne Schulz,
Miha Dezman,
Gunther Kurz,
Thorsten Stein,
Tamim Asfour
Abstract:
Online adaptation of exoskeleton control based on muscle activity sensing is a promising way to personalize exoskeletons based on the user's biosignals. While several electromyography (EMG) based methods have been shown to improve joint torque estimation, EMG sensors require direct skin contact and complex post-processing. In contrast, force myography (FMG) measures normal forces from changes in muscle volume due to muscle activity. We propose an FMG-based method to estimate knee and ankle joint torques by combining joint angles and velocities with muscle activity information. We learn a model for joint torque estimation using Gaussian process regression (GPR). The effectiveness of the proposed FMG-based method is validated on isokinetic motions performed by two subjects. The model is compared to a baseline model using only joint angle and velocity, as well as a model augmented by EMG data. The results show that integrating FMG into exoskeleton control improves the joint torque estimation for the ankle and knee and is therefore a promising way to improve adaptability to different exoskeleton users.
Submitted 17 September, 2024;
originally announced September 2024.
-
PerAct2: Benchmarking and Learning for Robotic Bimanual Manipulation Tasks
Authors:
Markus Grotz,
Mohit Shridhar,
Tamim Asfour,
Dieter Fox
Abstract:
Bimanual manipulation is challenging due to the precise spatial and temporal coordination required between two arms. While there exist several real-world bimanual systems, there is a lack of simulated benchmarks with a large task diversity for systematically studying bimanual capabilities across a wide range of tabletop tasks. This paper addresses the gap by extending RLBench to bimanual manipulation. We open-source our code and benchmark comprising 13 new tasks with 23 unique task variations, each requiring a high degree of coordination and adaptability. To kickstart the benchmark, we extended several state-of-the-art methods to bimanual manipulation and also present a language-conditioned behavioral cloning agent, PerAct2, which enables the learning and execution of bimanual 6-DoF manipulation tasks. Our novel network architecture efficiently integrates language processing with action prediction, allowing robots to understand and perform complex bimanual tasks in response to user-specified goals. Project website with code is available at: http://bimanual.github.io
Submitted 31 July, 2024; v1 submitted 28 June, 2024;
originally announced July 2024.
-
SciEx: Benchmarking Large Language Models on Scientific Exams with Human Expert Grading and Automatic Grading
Authors:
Tu Anh Dinh,
Carlos Mullov,
Leonard Bärmann,
Zhaolin Li,
Danni Liu,
Simon Reiß,
Jueun Lee,
Nathan Lerzer,
Fabian Ternava,
Jianfeng Gao,
Tobias Röddiger,
Alexander Waibel,
Tamim Asfour,
Michael Beigl,
Rainer Stiefelhagen,
Carsten Dachsbacher,
Klemens Böhm,
Jan Niehues
Abstract:
With the rapid development of Large Language Models (LLMs), it is crucial to have benchmarks which can evaluate the ability of LLMs on different domains. One common use of LLMs is performing tasks on scientific topics, such as writing algorithms, querying databases or giving mathematical proofs. Inspired by the way university students are evaluated on such tasks, in this paper, we propose SciEx, a benchmark consisting of university computer science exam questions, to evaluate LLMs' ability to solve scientific tasks. SciEx is (1) multilingual, containing both English and German exams, (2) multi-modal, containing questions that involve images, and (3) varied, containing different types of freeform questions with different difficulty levels, due to the nature of university exams. We evaluate the performance of various state-of-the-art LLMs on our new benchmark. Since SciEx questions are freeform, it is not straightforward to evaluate LLM performance. Therefore, we provide human expert grading of the LLM outputs on SciEx. We show that the free-form exams in SciEx remain challenging for the current LLMs, where the best LLM only achieves 59.4% exam grade on average. We also provide detailed comparisons between LLM performance and student performance on SciEx. To enable future evaluation of new LLMs, we propose using LLM-as-a-judge to grade the LLM answers on SciEx. Our experiments show that, although they do not perform perfectly on solving the exams, LLMs are decent as graders, achieving 0.948 Pearson correlation with expert grading.
Submitted 2 October, 2024; v1 submitted 14 June, 2024;
originally announced June 2024.
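For readers unfamiliar with the agreement measure quoted in this abstract, the Pearson correlation between automatic and expert grades can be computed as in the sketch below. The function name `pearson` and the idea of passing two equal-length grade lists are assumptions made for illustration; the paper's actual evaluation code and grade data are not shown here.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length grade lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

A value near 1 means the automatic grader ranks and scales answers almost linearly with the experts; it does not by itself show that the absolute grades match.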
-
Influence of Motion Restrictions in an Ankle Exoskeleton on Gait Kinematics and Stability in Straight Walking
Authors:
Miha Dezman,
Charlotte Marquardt,
Adnan Ugur,
Tamim Asfour
Abstract:
Exoskeleton devices impose kinematic constraints on a user's motion and affect their stability due to added mass but also due to the simplified mechanical design. This paper investigates how these constraints resulting from simplified mechanical designs impact the gait kinematics and stability of users wearing an ankle exoskeleton with a changeable number of degrees of freedom (DoF). The exoskeleton used in this paper allows one, two, or three DoF at the ankle, simulating different levels of mechanical complexity. This effect was evaluated in a pilot study consisting of six participants walking on a straight path. The results show that increasing the exoskeleton DoF results in an improvement of several metrics, including kinematics and gait parameters. The transition from 1 DoF to 2 DoF is shown to have a larger effect than the transition from 2 DoF to 3 DoF for an ankle exoskeleton. However, an exoskeleton with 3 DoF at the ankle featured the best results. Increasing the number of DoF resulted in stability values closer to the values when walking without the exoskeleton, despite the added weight of the exoskeleton.
Submitted 10 June, 2024;
originally announced June 2024.
-
Learning Symbolic and Subsymbolic Temporal Task Constraints from Bimanual Human Demonstrations
Authors:
Christian Dreher,
Tamim Asfour
Abstract:
Learning task models of bimanual manipulation from human demonstration and their execution on a robot should take temporal constraints between actions into account. This includes constraints on (i) the symbolic level such as precedence relations or temporal overlap in the execution, and (ii) the subsymbolic level such as the duration of different actions, or their starting and end points in time. Such temporal constraints are crucial for temporal planning, reasoning, and the exact timing for the execution of bimanual actions on a bimanual robot. In our previous work, we addressed the learning of temporal task constraints on the symbolic level and demonstrated how a robot can leverage this knowledge to respond to failures during execution. In this work, we propose a novel model-driven approach for the combined learning of symbolic and subsymbolic temporal task constraints from multiple bimanual human demonstrations. Our main contributions are a subsymbolic foundation of a temporal task model that describes temporal nexuses of actions in the task based on distributions of temporal differences between semantic action keypoints, as well as a method based on fuzzy logic to derive symbolic temporal task constraints from this representation. This complements our previous work on learning comprehensive temporal task models by integrating symbolic and subsymbolic information based on a subsymbolic foundation, while still maintaining the symbolic expressiveness of our previous approach. We compare our proposed approach with our previous pure-symbolic approach and show that we can reproduce and even outperform it. Additionally, we show how the subsymbolic temporal task constraints can synchronize otherwise unimanual movement primitives for bimanual behavior on a humanoid robot.
Submitted 3 September, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
KITchen: A Real-World Benchmark and Dataset for 6D Object Pose Estimation in Kitchen Environments
Authors:
Abdelrahman Younes,
Tamim Asfour
Abstract:
Despite the recent progress on 6D object pose estimation methods for robotic grasping, a substantial performance gap persists between the capabilities of these methods on existing datasets and their efficacy in real-world grasping and mobile manipulation tasks, particularly when robots rely solely on their monocular egocentric field of view (FOV). Existing real-world datasets primarily focus on table-top grasping scenarios, where a robot arm is placed in a fixed position and the objects are centralized within the FOV of fixed external camera(s). Assessing performance on such datasets may not accurately reflect the challenges encountered in everyday grasping and mobile manipulation tasks within kitchen environments, such as retrieving objects from higher shelves, sinks, dishwashers, ovens, refrigerators, or microwaves. To address this gap, we present KITchen, a novel benchmark designed specifically for estimating the 6D poses of objects located in diverse positions within kitchen settings. For this purpose, we recorded a comprehensive dataset comprising around 205k real-world RGBD images for 111 kitchen objects captured in two distinct kitchens, utilizing a humanoid robot with its egocentric perspectives. Subsequently, we developed a semi-automated annotation pipeline to streamline the labeling process of such datasets, resulting in the generation of 2D object labels, 2D object segmentation masks, and 6D object poses with minimal human effort. The benchmark, the dataset, and the annotation pipeline will be publicly available at https://kitchen-dataset.github.io/KITchen.
Submitted 17 December, 2024; v1 submitted 24 March, 2024;
originally announced March 2024.
-
Visual Imitation Learning of Task-Oriented Object Grasping and Rearrangement
Authors:
Yichen Cai,
Jianfeng Gao,
Christoph Pohl,
Tamim Asfour
Abstract:
Task-oriented object grasping and rearrangement are critical skills for robots to accomplish different real-world manipulation tasks. However, they remain challenging due to partial observations of the objects and shape variations in categorical objects. In this paper, we propose the Multi-feature Implicit Model (MIMO), a novel object representation that encodes multiple spatial features between a point and an object in an implicit neural field. Training such a model on multiple features ensures that it embeds the object shapes consistently in different aspects, thus improving its performance in object shape reconstruction from partial observation, shape similarity measure, and modeling spatial relations between objects. Based on MIMO, we propose a framework to learn task-oriented object grasping and rearrangement from single or multiple human demonstration videos. The evaluations in simulation show that our approach outperforms the state-of-the-art methods for multi- and single-view observations. Real-world experiments demonstrate the efficacy of our approach in one- and few-shot imitation learning of manipulation tasks.
Submitted 20 March, 2024;
originally announced March 2024.
-
Riemannian Flow Matching Policy for Robot Motion Learning
Authors:
Max Braun,
Noémie Jaquier,
Leonel Rozo,
Tamim Asfour
Abstract:
We introduce Riemannian Flow Matching Policies (RFMP), a novel model for learning and synthesizing robot visuomotor policies. RFMP leverages the efficient training and inference capabilities of flow matching methods. By design, RFMP inherits the strengths of flow matching: the ability to encode high-dimensional multimodal distributions, commonly encountered in robotic tasks, and a very simple and fast inference process. We demonstrate the applicability of RFMP to both state-based and vision-conditioned robot motion policies. Notably, as the robot state resides on a Riemannian manifold, RFMP inherently incorporates geometric awareness, which is crucial for realistic robotic tasks. To evaluate RFMP, we conduct two proof-of-concept experiments, comparing its performance against Diffusion Policies. Although both approaches successfully learn the considered tasks, our results show that RFMP provides smoother action trajectories with significantly lower inference times.
Submitted 27 August, 2024; v1 submitted 15 March, 2024;
originally announced March 2024.
-
Bi-KVIL: Keypoints-based Visual Imitation Learning of Bimanual Manipulation Tasks
Authors:
Jianfeng Gao,
Xiaoshu Jin,
Franziska Krebs,
Noémie Jaquier,
Tamim Asfour
Abstract:
Visual imitation learning has achieved impressive progress in learning unimanual manipulation tasks from a small set of visual observations, thanks to the latest advances in computer vision. However, learning bimanual coordination strategies and complex object relations from bimanual visual demonstrations, as well as generalizing them to categorical objects in novel cluttered scenes remain unsolved challenges. In this paper, we extend our previous work on keypoints-based visual imitation learning (K-VIL) to bimanual manipulation tasks. The proposed Bi-KVIL jointly extracts so-called Hybrid Master-Slave Relationships (HMSR) among objects and hands, bimanual coordination strategies, and sub-symbolic task representations. Our bimanual task representation is object-centric, embodiment-independent, and viewpoint-invariant, thus generalizing well to categorical objects in novel scenes. We evaluate our approach in various real-world applications, showcasing its ability to learn fine-grained bimanual manipulation tasks from a small number of human demonstration videos. Videos and source code are available at https://sites.google.com/view/bi-kvil.
Submitted 22 March, 2024; v1 submitted 5 March, 2024;
originally announced March 2024.
-
AutoGPT+P: Affordance-based Task Planning with Large Language Models
Authors:
Timo Birr,
Christoph Pohl,
Abdelrahman Younes,
Tamim Asfour
Abstract:
Recent advances in task planning leverage Large Language Models (LLMs) to improve generalizability by combining such models with classical planning algorithms to address their inherent limitations in reasoning capabilities. However, these approaches face the challenge of dynamically capturing the initial state of the task planning problem. To alleviate this issue, we propose AutoGPT+P, a system that combines an affordance-based scene representation with a planning system. Affordances encompass the action possibilities of an agent on the environment and objects present in it. Thus, deriving the planning domain from an affordance-based scene representation allows symbolic planning with arbitrary objects. AutoGPT+P leverages this representation to derive and execute a plan for a task specified by the user in natural language. In addition to solving planning tasks under a closed-world assumption, AutoGPT+P can also handle planning with incomplete information, e.g., tasks with missing objects, by exploring the scene, suggesting alternatives, or providing a partial plan. The affordance-based scene representation combines object detection with an automatically generated object-affordance-mapping using ChatGPT. The core planning tool extends existing work by automatically correcting semantic and syntactic errors. Our approach achieves a success rate of 98%, surpassing the 81% success rate of the current state-of-the-art LLM-based planning method SayCan on the SayCan instruction set. Furthermore, we evaluated our approach on our newly created dataset with 150 scenarios covering a wide range of complex tasks with missing objects, achieving a success rate of 79% on our dataset. The dataset and the code are publicly available at https://git.h2t.iar.kit.edu/birr/autogpt-p-standalone.
Submitted 23 July, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
MAkEable: Memory-centered and Affordance-based Task Execution Framework for Transferable Mobile Manipulation Skills
Authors:
Christoph Pohl,
Fabian Reister,
Fabian Peller-Konrad,
Tamim Asfour
Abstract:
To perform versatile mobile manipulation tasks in human-centered environments, the ability to efficiently transfer learned tasks and experiences from one robot to another or across different environments is key. In this paper, we present MAkEable, a versatile uni- and multi-manual mobile manipulation framework that facilitates the transfer of capabilities and knowledge across different tasks, environments, and robots. Our framework integrates an affordance-based task description into the memory-centric cognitive architecture of the ARMAR humanoid robot family, which supports the sharing of experiences and demonstrations for transfer learning. By representing mobile manipulation actions through affordances, i.e., interaction possibilities of the robot with its environment, we provide a unifying framework for the autonomous uni- and multi-manual manipulation of known and unknown objects in various environments. We demonstrate the applicability of the framework in real-world experiments for multiple robots, tasks, and environments. This includes grasping known and unknown objects, object placing, bimanual object grasping, memory-enabled skill transfer in a drawer opening scenario across two different humanoid robots, and a pouring task learned from human demonstration.
Submitted 21 March, 2024; v1 submitted 30 January, 2024;
originally announced January 2024.
-
How to Raise a Robot -- A Case for Neuro-Symbolic AI in Constrained Task Planning for Humanoid Assistive Robots
Authors:
Niklas Hemken,
Florian Jacob,
Fabian Peller-Konrad,
Rainer Kartmann,
Tamim Asfour,
Hannes Hartenstein
Abstract:
Humanoid robots will be able to assist humans in their daily life, in particular due to their versatile action capabilities. However, while these robots need a certain degree of autonomy to learn and explore, they also should respect various constraints, for access control and beyond. We explore the novel field of incorporating privacy, security, and access control constraints into robot task planning approaches. We report preliminary results on the classical symbolic approach, deep-learned neural networks, and modern ideas using large language models as a knowledge base. From analyzing their trade-offs, we conclude that a hybrid approach is necessary, and thereby present a new use case for the emerging field of neuro-symbolic artificial intelligence.
Submitted 27 December, 2023; v1 submitted 14 December, 2023;
originally announced December 2023.
-
Incremental Learning of Full-Pose Via-Point Movement Primitives on Riemannian Manifolds
Authors:
Tilman Daab,
Noémie Jaquier,
Christian Dreher,
Andre Meixner,
Franziska Krebs,
Tamim Asfour
Abstract:
Movement primitives (MPs) are compact representations of robot skills that can be learned from demonstrations and combined into complex behaviors. However, merely equipping robots with a fixed set of innate MPs is insufficient to deploy them in dynamic and unpredictable environments. Instead, the full potential of MPs remains to be attained via adaptable, large-scale MP libraries. In this paper, we propose a set of seven fundamental operations to incrementally learn, improve, and re-organize MP libraries. To showcase their applicability, we provide explicit formulations of the spatial operations for libraries composed of Via-Point Movement Primitives (VMPs). By building on Riemannian manifold theory, our approach enables the incremental learning of all parameters of position and orientation VMPs within a library. Moreover, our approach stores a fixed number of parameters, thus complying with the essential principles of incremental learning. We evaluate our approach to incrementally learn a VMP library from motion capture data provided sequentially.
Submitted 13 December, 2023;
originally announced December 2023.
-
Transfer Learning in Robotics: An Upcoming Breakthrough? A Review of Promises and Challenges
Authors:
Noémie Jaquier,
Michael C. Welle,
Andrej Gams,
Kunpeng Yao,
Bernardo Fichera,
Aude Billard,
Aleš Ude,
Tamim Asfour,
Danica Kragic
Abstract:
Transfer learning is a conceptually-enticing paradigm in pursuit of truly intelligent embodied agents. The core concept -- reusing prior knowledge to learn in and from novel situations -- is successfully leveraged by humans to handle novel situations. In recent years, transfer learning has received renewed interest from the community from different perspectives, including imitation learning, domain adaptation, and transfer of experience from simulation to the real world, among others. In this paper, we unify the concept of transfer learning in robotics and provide the first taxonomy of its kind considering the key concepts of robot, task, and environment. Through a review of the promises and challenges in the field, we identify the need of transferring at different abstraction levels, the need of quantifying the transfer gap and the quality of transfer, as well as the dangers of negative transfer. Via this position paper, we hope to channel the effort of the community towards the most significant roadblocks to realize the full potential of transfer learning in robotics.
Submitted 2 May, 2024; v1 submitted 29 November, 2023;
originally announced November 2023.
-
Reinforcement Learning for Safety Testing: Lessons from A Mobile Robot Case Study
Authors:
Tom P. Huck,
Martin Kaiser,
Constantin Cronrath,
Bengt Lennartson,
Torsten Kröger,
Tamim Asfour
Abstract:
Safety-critical robot systems need thorough testing to expose design flaws and software bugs which could endanger humans. Testing in simulation is becoming increasingly popular, as it can be applied early in the development process and does not endanger any real-world operators. However, not all safety-critical flaws become immediately observable in simulation. Some may only become observable under certain critical conditions. If these conditions are not covered, safety flaws may remain undetected. Creating critical tests is therefore crucial. In recent years, there has been a trend towards using Reinforcement Learning (RL) for this purpose. Guided by domain-specific reward functions, RL algorithms are used to learn critical test strategies. This paper presents a case study in which the collision avoidance behavior of a mobile robot is subjected to RL-based testing. The study confirms prior research which shows that RL can be an effective testing tool. However, the study also highlights certain challenges associated with RL-based testing, namely (i) a possible lack of diversity in test conditions and (ii) the phenomenon of reward hacking where the RL agent behaves in undesired ways due to a misalignment of reward and test specification. The challenges are illustrated with data and examples from the experiments, and possible mitigation strategies are discussed.
Submitted 6 November, 2023;
originally announced November 2023.
-
Unraveling the Single Tangent Space Fallacy: An Analysis and Clarification for Applying Riemannian Geometry in Robot Learning
Authors:
Noémie Jaquier,
Leonel Rozo,
Tamim Asfour
Abstract:
In the realm of robotics, numerous downstream robotics tasks leverage machine learning methods for processing, modeling, or synthesizing data. Often, this data comprises variables that inherently carry geometric constraints, such as the unit-norm condition of quaternions representing rigid-body orientations or the positive definiteness of stiffness and manipulability ellipsoids. Handling such geometric constraints effectively requires the incorporation of tools from differential geometry into the formulation of machine learning methods. In this context, Riemannian manifolds emerge as a powerful mathematical framework to handle such geometric constraints. Nevertheless, their recent adoption in robot learning has been largely characterized by a mathematically-flawed simplification, hereinafter referred to as the "single tangent space fallacy". This approach involves merely projecting the data of interest onto a single tangent (Euclidean) space, over which an off-the-shelf learning algorithm is applied. This paper provides a theoretical elucidation of various misconceptions surrounding this approach and offers experimental evidence of its shortcomings. Finally, it presents valuable insights to promote best practices when employing Riemannian geometry within robot learning applications.
Submitted 29 April, 2024; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Incremental Learning of Humanoid Robot Behavior from Natural Interaction and Large Language Models
Authors:
Leonard Bärmann,
Rainer Kartmann,
Fabian Peller-Konrad,
Jan Niehues,
Alex Waibel,
Tamim Asfour
Abstract:
Natural-language dialog is key for intuitive human-robot interaction. It can be used not only to express humans' intents, but also to communicate instructions for improvement if a robot does not understand a command correctly. Of great importance is to endow robots with the ability to learn from such interaction experience in an incremental way to allow them to improve their behaviors or avoid mistakes in the future. In this paper, we propose a system to achieve incremental learning of complex behavior from natural interaction, and demonstrate its implementation on a humanoid robot. Building on recent advances, we present a system that deploys Large Language Models (LLMs) for high-level orchestration of the robot's behavior, based on the idea of enabling the LLM to generate Python statements in an interactive console to invoke both robot perception and action. The interaction loop is closed by feeding back human instructions, environment observations, and execution results to the LLM, thus informing the generation of the next statement. Specifically, we introduce incremental prompt learning, which enables the system to interactively learn from its mistakes. For that purpose, the LLM can call another LLM responsible for code-level improvements of the current interaction based on human feedback. The improved interaction is then saved in the robot's memory, and thus retrieved on similar requests. We integrate the system in the robot cognitive architecture of the humanoid robot ARMAR-6 and evaluate our methods both quantitatively (in simulation) and qualitatively (in simulation and real-world) by demonstrating generalized incrementally-learned knowledge.
Submitted 16 May, 2024; v1 submitted 8 September, 2023;
originally announced September 2023.
-
Uncertainty-aware Risk Assessment of Robotic Systems via Importance Sampling
Authors:
Woo-Jeong Baek,
Tom P. Huck,
Joschka Haas,
Jonas Lewandrowski,
Tamim Asfour,
Torsten Kröger
Abstract:
In this paper, we introduce a probabilistic approach to risk assessment of robot systems by focusing on the impact of uncertainties. While various approaches to identifying systematic hazards (e.g., bugs, design flaws, etc.) can be found in current literature, little attention has been devoted to evaluating risks in robot systems in a probabilistic manner. Existing methods rely on discrete notions for dangerous events and assume that the consequences of these can be described by simple logical operations. In this work, we consider measurement uncertainties as one main contributor to the evolvement of risks. Specifically, we study the impact of temporal and spatial uncertainties on the occurrence probability of dangerous failures, thereby deriving an approach for an uncertainty-aware risk assessment. Secondly, we introduce a method to improve the statistical significance of our results: While the rare occurrence of hazardous events makes it challenging to draw conclusions with reliable accuracy, we show that importance sampling -- a technique that successively generates samples in regions with sparse probability densities -- allows for overcoming this issue. We demonstrate the validity of our novel uncertainty-aware risk assessment method in three simulation scenarios from the domain of human-robot collaboration. Finally, we show how the results can be used to evaluate arbitrary safety limits of robot systems.
Submitted 27 August, 2023;
originally announced August 2023.
-
On the Design of Region-Avoiding Metrics for Collision-Safe Motion Generation on Riemannian Manifolds
Authors:
Holger Klein,
Noémie Jaquier,
Andre Meixner,
Tamim Asfour
Abstract:
The generation of energy-efficient and dynamic-aware robot motions that satisfy constraints such as joint limits, self-collisions, and collisions with the environment remains a challenge. In this context, Riemannian geometry offers promising solutions by identifying robot motions with geodesics on the so-called configuration space manifold. While this manifold naturally considers the intrinsic robot dynamics, constraints such as joint limits, self-collisions, and collisions with the environment remain overlooked. In this paper, we propose a modification of the Riemannian metric of the configuration space manifold allowing for the generation of robot motions as geodesics that efficiently avoid given regions. We introduce a class of Riemannian metrics based on barrier functions that guarantee strict region avoidance by systematically generating accelerations away from no-go regions in joint and task space. We evaluate the proposed Riemannian metric to generate energy-efficient, dynamic-aware, and collision-free motions of a humanoid robot as geodesics and sequences thereof.
Submitted 28 July, 2023;
originally announced July 2023.
-
Interactive and Incremental Learning of Spatial Object Relations from Human Demonstrations
Authors:
Rainer Kartmann,
Tamim Asfour
Abstract:
Humans use semantic concepts such as spatial relations between objects to describe scenes and communicate tasks such as "Put the tea to the right of the cup" or "Move the plate between the fork and the spoon." Just like children, assistive robots must be able to learn the sub-symbolic meaning of such concepts from human demonstrations and instructions. We address the problem of incrementally learning geometric models of spatial relations from few demonstrations collected online during interaction with a human. Such models enable a robot to manipulate objects in order to fulfill desired spatial relations specified by verbal instructions. At the start, we assume the robot has no geometric model of spatial relations. Given a task as above, the robot requests the user to demonstrate the task once in order to create a model from a single demonstration, leveraging a cylindrical probability distribution as a generative representation of spatial relations. We show how this model can be updated incrementally with each new demonstration, without access to past examples, in a sample-efficient way using incremental maximum likelihood estimation, and demonstrate the approach on a real humanoid robot.
Submitted 16 May, 2023;
originally announced May 2023.
-
Virtual Reality via Object Pose Estimation and Active Learning: Realizing Telepresence Robots with Aerial Manipulation Capabilities
Authors:
Jongseok Lee,
Ribin Balachandran,
Konstantin Kondak,
Andre Coelho,
Marco De Stefano,
Matthias Humt,
Jianxiang Feng,
Tamim Asfour,
Rudolph Triebel
Abstract:
This article presents a novel telepresence system for advancing aerial manipulation in dynamic and unstructured environments. The proposed system not only features a haptic device, but also a virtual reality (VR) interface that provides real-time 3D displays of the robot's workspace as well as haptic guidance to its remotely located operator. To realize this, multiple sensors, namely a LiDAR, cameras, and IMUs, are utilized. For processing of the acquired sensory data, pose estimation pipelines are devised for industrial objects of both known and unknown geometries. We further propose an active learning pipeline in order to increase the sample efficiency of a pipeline component that relies on Deep Neural Network (DNN)-based object detection. All these algorithms jointly address various challenges encountered during the execution of perception tasks in industrial scenarios. In the experiments, exhaustive ablation studies are provided to validate the proposed pipelines. Methodologically, these results commonly suggest how an awareness of the algorithms' own failures and uncertainty ("introspection") can be used to tackle the encountered problems. Moreover, outdoor experiments are conducted to evaluate the effectiveness of the overall system in enhancing aerial manipulation capabilities. In particular, with flight campaigns over days and nights, from spring to winter, and with different users and locations, we demonstrate over 70 robust executions of pick-and-place, force application and peg-in-hole tasks with the DLR cable-Suspended Aerial Manipulator (SAM). As a result, we show the viability of the proposed system in future industrial applications.
Submitted 10 February, 2023; v1 submitted 18 October, 2022;
originally announced October 2022.
-
Bringing motion taxonomies to continuous domains via GPLVM on hyperbolic manifolds
Authors:
Noémie Jaquier,
Leonel Rozo,
Miguel González-Duque,
Viacheslav Borovitskiy,
Tamim Asfour
Abstract:
Human motion taxonomies serve as high-level hierarchical abstractions that classify how humans move and interact with their environment. They have proven useful to analyse grasps, manipulation skills, and whole-body support poses. Despite substantial efforts devoted to design their hierarchy and underlying categories, their use remains limited. This may be attributed to the lack of computational models that fill the gap between the discrete hierarchical structure of the taxonomy and the high-dimensional heterogeneous data associated to its categories. To overcome this problem, we propose to model taxonomy data via hyperbolic embeddings that capture the associated hierarchical structure. We achieve this by formulating a novel Gaussian process hyperbolic latent variable model that incorporates the taxonomy structure through graph-based priors on the latent space and distance-preserving back constraints. We validate our model on three different human motion taxonomies to learn hyperbolic embeddings that faithfully preserve the original graph structure. We show that our model properly encodes unseen data from existing or new taxonomy categories, and outperforms its Euclidean and VAE-based counterparts. Finally, through proof-of-concept experiments, we show that our model may be used to generate realistic trajectories between the learned embeddings.
Submitted 15 September, 2024; v1 submitted 4 October, 2022;
originally announced October 2022.
-
Riemannian geometry as a unifying theory for robot motion learning and control
Authors:
Noémie Jaquier,
Tamim Asfour
Abstract:
Riemannian geometry is a mathematical field which has been the cornerstone of revolutionary scientific discoveries such as the theory of general relativity. Despite early uses in robot design and recent applications for exploiting data with specific geometries, it mostly remains overlooked in robotics. With this blue sky paper, we argue that Riemannian geometry provides the most suitable tools to analyze and generate well-coordinated, energy-efficient motions of robots with many degrees of freedom. Via preliminary solutions and novel research directions, we discuss how Riemannian geometry may be leveraged to design and combine physically-meaningful synergies for robotics, and how this theory also opens the door to coupling motion synergies with perceptual inputs.
Submitted 30 September, 2022;
originally announced September 2022.
-
K-VIL: Keypoints-based Visual Imitation Learning
Authors:
Jianfeng Gao,
Zhi Tao,
Noémie Jaquier,
Tamim Asfour
Abstract:
Visual imitation learning provides efficient and intuitive solutions for robotic systems to acquire novel manipulation skills. However, simultaneously learning geometric task constraints and control policies from visual inputs alone remains a challenging problem. In this paper, we propose an approach for keypoint-based visual imitation (K-VIL) that automatically extracts sparse, object-centric, and embodiment-independent task representations from a small number of human demonstration videos. The task representation is composed of keypoint-based geometric constraints on principal manifolds, their associated local frames, and the movement primitives that are then needed for the task execution. Our approach is capable of extracting such task representations from a single demonstration video, and of incrementally updating them when new demonstrations become available. To reproduce manipulation skills using the learned set of prioritized geometric constraints in novel scenes, we introduce a novel keypoint-based admittance controller. We evaluate our approach in several real-world applications, showcasing its ability to deal with cluttered scenes, viewpoint mismatch, new instances of categorical objects, and large object pose and shape variations, as well as its efficiency and robustness in both one-shot and few-shot imitation learning settings. Videos and source code are available at https://sites.google.com/view/k-vil.
Submitted 25 July, 2023; v1 submitted 7 September, 2022;
originally announced September 2022.
-
SpeedFolding: Learning Efficient Bimanual Folding of Garments
Authors:
Yahav Avigal,
Lars Berscheid,
Tamim Asfour,
Torsten Kröger,
Ken Goldberg
Abstract:
Folding garments reliably and efficiently is a long-standing challenge in robotic manipulation due to the complex dynamics and high-dimensional configuration space of garments. An intuitive approach is to initially manipulate the garment to a canonical smooth configuration before folding. In this work, we develop SpeedFolding, a reliable and efficient bimanual system, which, given user-defined instructions as folding lines, manipulates an initially crumpled garment to (1) a smoothed and (2) a folded configuration. Our primary contribution is a novel neural network architecture that is able to predict pairs of gripper poses to parameterize a diverse set of bimanual action primitives. After learning from 4300 human-annotated and self-supervised actions, the robot is able to fold garments from a random initial configuration in under 120s on average with a success rate of 93%. Real-world experiments show that the system is able to generalize to unseen garments of different color, shape, and stiffness. While prior work achieved 3-6 Folds Per Hour (FPH), SpeedFolding achieves 30-40 FPH.
Submitted 9 September, 2022; v1 submitted 22 August, 2022;
originally announced August 2022.
-
A Riemannian Take on Human Motion Analysis and Retargeting
Authors:
Holger Klein,
Noémie Jaquier,
Andre Meixner,
Tamim Asfour
Abstract:
Dynamic motions of humans and robots are widely driven by posture-dependent nonlinear interactions between their degrees of freedom. However, these dynamical effects remain mostly overlooked when studying the mechanisms of human movement generation. Inspired by recent works, we hypothesize that human motions are planned as sequences of geodesic synergies, and thus correspond to coordinated joint movements achieved with piecewise minimum energy. The underlying computational model is built on Riemannian geometry to account for the inertial characteristics of the body. Through the analysis of various human arm motions, we find that our model segments motions into geodesic synergies, and successfully predicts observed arm postures, hand trajectories, as well as their respective velocity profiles. Moreover, we show that our analysis can further be exploited to transfer arm motions to robots by reproducing individual human synergies as geodesic paths in the robot configuration space.
Submitted 2 August, 2022;
originally announced August 2022.
-
Deep Learning Approaches to Grasp Synthesis: A Review
Authors:
Rhys Newbury,
Morris Gu,
Lachlan Chumbley,
Arsalan Mousavian,
Clemens Eppner,
Jürgen Leitner,
Jeannette Bohg,
Antonio Morales,
Tamim Asfour,
Danica Kragic,
Dieter Fox,
Akansel Cosgun
Abstract:
Grasping is the process of picking up an object by applying forces and torques at a set of contacts. Recent advances in deep-learning methods have allowed rapid progress in robotic object grasping. In this systematic review, we surveyed the publications over the last decade, with a particular interest in grasping an object using all 6 degrees of freedom of the end-effector pose. Our review found four common methodologies for robotic grasping: sampling-based approaches, direct regression, reinforcement learning, and exemplar approaches. Additionally, we found two "supporting methods" that use deep learning to support the grasping process: shape approximation and affordances. We have distilled the publications found in this systematic review (85 papers) into ten key takeaways we consider crucial for future robotic grasping and manipulation research. An online version of the survey is available at https://rhys-newbury.github.io/projects/6dof/
Submitted 4 May, 2023; v1 submitted 6 July, 2022;
originally announced July 2022.
-
A Memory System of a Robot Cognitive Architecture and its Implementation in ArmarX
Authors:
Fabian Peller-Konrad,
Rainer Kartmann,
Christian R. G. Dreher,
Andre Meixner,
Fabian Reister,
Markus Grotz,
Tamim Asfour
Abstract:
Cognitive agents such as humans and robots perceive their environment through an abundance of sensors producing streams of data that need to be processed to generate intelligent behavior. A key question of cognition-enabled and AI-driven robotics is how to organize and manage knowledge efficiently in a cognitive robot control architecture. We argue that memory is a central active component of such architectures that mediates between semantic and sensorimotor representations, orchestrates the flow of data streams and events between different processes, and provides the components of a cognitive architecture with data-driven services for the abstraction of semantics from sensorimotor data, the parametrization of symbolic plans for execution, and the prediction of action effects.
Based on related work and the experience gained in developing our ARMAR humanoid robot systems, we identified conceptual and technical requirements of a memory system as a central component of a cognitive robot control architecture that facilitate the realization of high-level cognitive abilities such as explaining, reasoning, prospection, simulation and augmentation. Conceptually, a memory should be active, support multi-modal data representations, associate knowledge, be introspective, and have an inherently episodic structure. Technically, the memory should support a distributed design, be access-efficient and capable of long-term data storage. We introduce the memory system for our cognitive robot control architecture and its implementation in the robot software framework ArmarX. We evaluate the efficiency of the memory system with respect to transfer speeds, compression, reproduction and prediction capabilities.
Submitted 31 January, 2023; v1 submitted 5 June, 2022;
originally announced June 2022.
-
Learning to Sequence and Blend Robot Skills via Differentiable Optimization
Authors:
Noémie Jaquier,
You Zhou,
Julia Starke,
Tamim Asfour
Abstract:
In contrast to humans and animals who naturally execute seamless motions, learning and smoothly executing sequences of actions remains a challenge in robotics. This paper introduces a novel skill-agnostic framework that learns to sequence and blend skills based on differentiable optimization. Our approach encodes sequences of previously-defined skills as quadratic programs (QP), whose parameters determine the relative importance of skills along the task. Seamless skill sequences are then learned from demonstrations by exploiting differentiable optimization layers and a tailored loss formulated from the QP optimality conditions. Via the use of differentiable optimization, our work offers novel perspectives on multitask control. We validate our approach in a pick-and-place scenario with planar robots, a pouring experiment with a real humanoid robot, and a bimanual sweeping task with a human model.
Submitted 1 June, 2022;
originally announced June 2022.
-
Geometry-aware Bayesian Optimization in Robotics using Riemannian Matérn Kernels
Authors:
Noémie Jaquier,
Viacheslav Borovitskiy,
Andrei Smolensky,
Alexander Terenin,
Tamim Asfour,
Leonel Rozo
Abstract:
Bayesian optimization is a data-efficient technique which can be used for control parameter tuning, parametric policy adaptation, and structure design in robotics. Many of these problems require optimization of functions defined on non-Euclidean domains like spheres, rotation groups, or spaces of positive-definite matrices. To do so, one must place a Gaussian process prior, or equivalently define a kernel, on the space of interest. Effective kernels typically reflect the geometry of the spaces they are defined on, but designing them is generally non-trivial. Recent work on the Riemannian Matérn kernels, based on stochastic partial differential equations and spectral theory of the Laplace-Beltrami operator, offers promising avenues towards constructing such geometry-aware kernels. In this paper, we study techniques for implementing these kernels on manifolds of interest in robotics, demonstrate their performance on a set of artificial benchmark functions, and illustrate geometry-aware Bayesian optimization for a variety of robotic applications, covering orientation control, manipulability optimization, and motion planning, while showing its improved performance.
Submitted 17 March, 2023; v1 submitted 2 November, 2021;
originally announced November 2021.
-
Graph-based Task-specific Prediction Models for Interactions between Deformable and Rigid Objects
Authors:
Zehang Weng,
Fabian Paus,
Anastasiia Varava,
Hang Yin,
Tamim Asfour,
Danica Kragic
Abstract:
Capturing scene dynamics and predicting the future scene state is challenging but essential for robotic manipulation tasks, especially when the scene contains both rigid and deformable objects. In this work, we contribute a simulation environment and generate a novel dataset for task-specific manipulation, involving interactions between rigid objects and a deformable bag. The dataset incorporates a rich variety of scenarios including different object sizes, object numbers and manipulation actions. We approach dynamics learning by proposing an object-centric graph representation and two modules, the Active Prediction Module (APM) and the Position Prediction Module (PPM), both based on graph neural networks with an encode-process-decode architecture. At the inference stage, we build a two-stage model based on the learned modules for single time step prediction. We combine modules with different prediction horizons into a mixed-horizon model which addresses long-term prediction. In an ablation study, we show the benefits of the two-stage model for single time step prediction and the effectiveness of the mixed-horizon model for long-term prediction tasks. Supplementary material is available at https://github.com/wengzehang/deformable_rigid_interaction_prediction
Submitted 4 March, 2021;
originally announced March 2021.
-
Learning to Shift Attention for Motion Generation
Authors:
You Zhou,
Jianfeng Gao,
Tamim Asfour
Abstract:
One challenge of motion generation using robot learning from demonstration techniques is that human demonstrations follow a distribution with multiple modes for one task query. Previous approaches fail to capture all modes or tend to average modes of the demonstrations and thus generate invalid trajectories. The other difficulty is the small number of demonstrations, which cannot cover the entire working space. To overcome this problem, a motion generation model with extrapolation ability is needed. Previous works restrict task queries to local frames and learn representations in local frames. We propose a model to solve both problems. For multiple modes, we suggest learning local latent representations of motion trajectories with a density estimation method based on real-valued non-volume preserving (RealNVP) transformations, which provide a set of powerful, stably invertible, and learnable transformations. To improve the extrapolation ability, we propose to shift the attention of the robot from one local frame to another during the task execution. In experiments, we consider the docking problem also used in previous works, where a trajectory has to be generated to connect two dockers without collision. We increase the complexity of the task and show that the proposed method outperforms other approaches. In addition, we evaluate the approach in real robot experiments.
Submitted 24 February, 2021;
originally announced February 2021.
-
Object and Relation Centric Representations for Push Effect Prediction
Authors:
Ahmet E. Tekden,
Aykut Erdem,
Erkut Erdem,
Tamim Asfour,
Emre Ugur
Abstract:
Pushing is an essential non-prehensile manipulation skill used for tasks ranging from pre-grasp manipulation to scene rearrangement and reasoning about object relations in the scene; pushing actions have thus been widely studied in robotics. The effective use of pushing actions often requires an understanding of the dynamics of the manipulated objects and adaptation to the discrepancies between prediction and reality. For this reason, effect prediction and parameter estimation with pushing actions have been heavily investigated in the literature. However, current approaches are limited because they either model systems with a fixed number of objects or use image-based representations whose outputs are not very interpretable and quickly accumulate errors. In this paper, we propose a graph neural network based framework for effect prediction and parameter estimation of pushing actions by modeling object relations based on contacts or articulations. Our framework is validated in both real and simulated environments containing differently shaped multi-part objects connected via different types of joints and objects with different masses, and it outperforms image-based representations on physics prediction. Our approach enables the robot to predict and adapt the effect of a pushing action as it observes the scene. It can also be used for tool manipulation with never-seen tools. Further, we demonstrate 6D effect prediction in the lever-up action in the context of robot-based hard-disk disassembly.
Submitted 22 February, 2023; v1 submitted 3 February, 2021;
originally announced February 2021.
-
Uncertainty-aware Contact-safe Model-based Reinforcement Learning
Authors:
Cheng-Yu Kuo,
Andreas Schaarschmidt,
Yunduan Cui,
Tamim Asfour,
Takamitsu Matsubara
Abstract:
This letter presents contact-safe Model-based Reinforcement Learning (MBRL) for robot applications that achieves contact-safe behaviors in the learning process. In typical MBRL, we cannot expect the data-driven model to generate accurate and reliable policies for the intended robotic tasks during the learning process due to sample scarcity. Operating these unreliable policies in a contact-rich environment could cause damage to the robot and its surroundings. To alleviate the risk of causing damage through unexpected intensive physical contacts, we present the contact-safe MBRL that associates the probabilistic Model Predictive Control's (pMPC) control limits with the model uncertainty so that the allowed acceleration of controlled behavior is adjusted according to learning progress. Control planning with such uncertainty-aware control limits is formulated as a deterministic MPC problem using a computation-efficient approximated GP dynamics and an approximated inference technique. Our approach's effectiveness is evaluated through bowl mixing tasks with simulated and real robots and scooping tasks with a real robot, as examples of contact-rich manipulation skills. (video: https://youtu.be/sdhHP3NhYi0)
Submitted 9 March, 2021; v1 submitted 16 October, 2020;
originally announced October 2020.
-
A Soft Humanoid Hand with In-Finger Visual Perception
Authors:
Felix Hundhausen,
Julia Starke,
Tamim Asfour
Abstract:
We present a novel underactuated humanoid five-finger soft hand, the KIT softhand, which is equipped with cameras in the fingertips and integrates a high-performance embedded system for visual processing and control. We describe the actuation mechanism of the hand and the tendon-driven soft finger design with internally routed high-bandwidth flat-flex cables. For efficient on-board parallel processing of visual data from the cameras in each fingertip, we present a hybrid embedded architecture consisting of a field-programmable gate array (FPGA) and a microcontroller that allows the realization of visual object segmentation based on convolutional neural networks.
We evaluate the hand design by conducting durability experiments with one finger and quantify the grasp performance in terms of grasping force, speed and grasp success. The results show that the hand exhibits a grasp force of 31.8 N and a mechanical durability of the finger of more than 15,000 closing cycles. Finally, we evaluate the accuracy of visual object segmentation during the different phases of the grasping process using five different objects. An accuracy above 90% is achieved.
Submitted 5 June, 2020;
originally announced June 2020.
-
Learning Compliance Adaptation in Contact-Rich Manipulation
Authors:
Jianfeng Gao,
You Zhou,
Tamim Asfour
Abstract:
Compliant robot behavior is crucial for the realization of contact-rich manipulation tasks. In such tasks, it is important to ensure high stiffness and force tracking accuracy during normal task execution as well as rapid adaptation and compliant behavior to react to abnormal situations and changes. In this paper, we propose a novel approach for learning predictive models of force profiles required for contact-rich tasks. Such models allow detecting unexpected situations and facilitate better adaptive control. The approach combines anomaly detection based on Bidirectional Gated Recurrent Units (Bi-GRU) with an adaptive force/impedance controller. We evaluated the approach in simulated and real-world experiments on a humanoid robot. The results show that the approach allows simultaneous high tracking accuracy of desired motions and force profiles as well as adaptation to force perturbations due to physical human interaction.
Submitted 1 May, 2020;
originally announced May 2020.
-
Learning Visual Dynamics Models of Rigid Objects using Relational Inductive Biases
Authors:
Fabio Ferreira,
Lin Shao,
Tamim Asfour,
Jeannette Bohg
Abstract:
Endowing robots with human-like physical reasoning abilities remains challenging. We argue that existing methods often disregard spatio-temporal relations and that, by using Graph Neural Networks (GNNs) that incorporate a relational inductive bias, we can shift the learning process towards exploiting relations. In this work, we learn action-conditional forward dynamics models of a simulated manipulation task from visual observations involving cluttered and irregularly shaped objects. We investigate two GNN approaches and empirically assess their capability to generalize to scenarios with novel objects and an increasing number of objects. The first, a Graph Networks (GN) based approach, considers explicitly defined edge attributes. Not only does it consistently underperform an auto-encoder baseline that we modified to predict future states, but our results also indicate that different edge attributes can significantly influence the predictions. Consequently, we develop the Auto-Predictor, which does not rely on explicitly defined edge attributes. It outperforms the baseline and the GN-based models. Overall, our results show the sensitivity of GNN-based approaches to the task representation and the efficacy of relational inductive biases, and they advocate choosing lightweight approaches that implicitly reason about relations over ones that leave these decisions to human designers.
Submitted 23 October, 2019; v1 submitted 9 September, 2019;
originally announced September 2019.
-
Learning Object-Action Relations from Bimanual Human Demonstration Using Graph Networks
Authors:
Christian R. G. Dreher,
Mirko Wächter,
Tamim Asfour
Abstract:
Recognizing human actions is a vital task for a humanoid robot, especially in domains like programming by demonstration. Previous approaches to action recognition primarily focused on the overall prevalent action being executed, but we argue that bimanual human motion cannot always be described sufficiently with a single action label. We present a system for frame-wise action classification and segmentation in bimanual human demonstrations. The system extracts symbolic spatial object relations from raw RGB-D video data captured from the robot's point of view in order to build graph-based scene representations. To learn object-action relations, a graph network classifier is trained using these representations together with ground truth action labels to predict the action executed by each hand.
We evaluated the proposed classifier on a new RGB-D video dataset showing daily action sequences focusing on bimanual manipulation actions. It consists of 6 subjects performing 9 tasks with 10 repetitions each, which leads to 540 video recordings with 2 hours and 18 minutes total playtime and per-hand ground truth action labels for each frame. We show that the classifier is able to reliably identify (action classification macro F1-score of 0.86) the true executed action of each hand within its top 3 predictions on a frame-by-frame basis without prior temporal action segmentation.
Submitted 12 September, 2019; v1 submitted 22 August, 2019;
originally announced August 2019.
-
Noise Regularization for Conditional Density Estimation
Authors:
Jonas Rothfuss,
Fabio Ferreira,
Simon Boehm,
Simon Walther,
Maxim Ulrich,
Tamim Asfour,
Andreas Krause
Abstract:
Modelling statistical relationships beyond the conditional mean is crucial in many settings. Conditional density estimation (CDE) aims to learn the full conditional probability density from data. Though highly expressive, neural network based CDE models can suffer from severe over-fitting when trained with the maximum likelihood objective. Due to the inherent structure of such models, classical regularization approaches in the parameter space are rendered ineffective. To address this issue, we develop a model-agnostic noise regularization method for CDE that adds random perturbations to the data during training. We demonstrate that the proposed approach corresponds to a smoothness regularization and prove its asymptotic consistency. In our experiments, noise regularization significantly and consistently outperforms other regularization methods across seven data sets and three CDE models. The effectiveness of noise regularization makes neural network based CDE the preferable method over previous non- and semi-parametric approaches, even when training data is scarce.
Submitted 14 February, 2020; v1 submitted 21 July, 2019;
originally announced July 2019.
-
ProMP: Proximal Meta-Policy Search
Authors:
Jonas Rothfuss,
Dennis Lee,
Ignasi Clavera,
Tamim Asfour,
Pieter Abbeel
Abstract:
Credit assignment in Meta-reinforcement learning (Meta-RL) is still poorly understood. Existing methods either neglect credit assignment to pre-adaptation behavior or implement it naively. This leads to poor sample-efficiency during meta-training as well as ineffective task identification strategies. This paper provides a theoretical analysis of credit assignment in gradient-based Meta-RL. Building on the gained insights, we develop a novel meta-learning algorithm that overcomes both the issue of poor credit assignment and previous difficulties in estimating meta-policy gradients. By controlling the statistical distance of both pre-adaptation and adapted policies during meta-policy search, the proposed algorithm enables efficient and stable meta-learning. Our approach leads to superior pre-adaptation policy behavior and consistently outperforms previous Meta-RL algorithms in sample-efficiency, wall-clock time, and asymptotic performance.
Submitted 11 February, 2022; v1 submitted 15 October, 2018;
originally announced October 2018.
-
A Framework for Evaluating Motion Segmentation Algorithms
Authors:
Christian R. G. Dreher,
Nicklas Kulp,
Christian Mandery,
Mirko Wächter,
Tamim Asfour
Abstract:
There have been many proposals for algorithms segmenting human whole-body motion in the literature. However, the wide range of use cases, datasets, and quality measures that were used for the evaluation renders the comparison of algorithms challenging. In this paper, we introduce a framework that puts motion segmentation algorithms on a unified testing ground and makes it possible to compare them. The testing ground features both a set of quality measures known from the literature and a novel approach tailored to the evaluation of motion segmentation algorithms, termed the Integrated Kernel approach. Datasets of motion recordings, provided with a ground truth, are included as well. They are labelled in a new way, which hierarchically organises the ground truth, to cover the different use cases that segmentation algorithms can have. The framework and datasets are publicly available and are intended as a service for the community regarding the comparison and evaluation of existing and new motion segmentation algorithms.
Submitted 30 September, 2018;
originally announced October 2018.
-
Model-Based Reinforcement Learning via Meta-Policy Optimization
Authors:
Ignasi Clavera,
Jonas Rothfuss,
John Schulman,
Yasuhiro Fujita,
Tamim Asfour,
Pieter Abbeel
Abstract:
Model-based reinforcement learning approaches carry the promise of being data efficient. However, due to challenges in learning dynamics models that sufficiently match the real-world dynamics, they struggle to achieve the same asymptotic performance as model-free methods. We propose Model-Based Meta-Policy-Optimization (MB-MPO), an approach that forgoes the strong reliance on accurate learned dynamics models. Using an ensemble of learned dynamics models, MB-MPO meta-learns a policy that can quickly adapt to any model in the ensemble with one policy gradient step. This steers the meta-policy towards internalizing consistent dynamics predictions among the ensemble while shifting the burden of behaving optimally w.r.t. the model discrepancies towards the adaptation step. Our experiments show that MB-MPO is more robust to model imperfections than previous model-based approaches. Finally, we demonstrate that our approach is able to match the asymptotic performance of model-free methods while requiring significantly less experience.
Submitted 13 September, 2018;
originally announced September 2018.
-
Introducing the Simulated Flying Shapes and Simulated Planar Manipulator Datasets
Authors:
Fabio Ferreira,
Jonas Rothfuss,
Eren Erdal Aksoy,
You Zhou,
Tamim Asfour
Abstract:
We release two artificial datasets, Simulated Flying Shapes and Simulated Planar Manipulator, that allow testing the learning ability of video processing systems. In particular, the datasets are meant as a tool for easily assessing the sanity of deep neural network models that aim to encode, reconstruct or predict video frame sequences. The datasets each consist of 90000 videos. The Simulated Flying Shapes dataset comprises scenes showing two objects of equal shape (rectangle, triangle and circle) and size in which one object approaches its counterpart. The Simulated Planar Manipulator dataset shows a 3-DOF planar manipulator that executes a pick-and-place task in which it has to place a size-varying circle on a square platform. Unlike other widely used datasets such as moving MNIST [1], [2], the two presented datasets involve goal-oriented tasks (e.g. the manipulator grasping an object and placing it on a platform), rather than showing random movements. This makes our datasets more suitable for testing prediction capabilities and the learning of sophisticated motions by a machine learning model. This technical document aims to provide an introduction to the usage of both datasets.
Submitted 2 July, 2018;
originally announced July 2018.