-
CUPID: Curating Data your Robot Loves with Influence Functions
Authors:
Christopher Agia,
Rohan Sinha,
Jingyun Yang,
Rika Antonova,
Marco Pavone,
Haruki Nishimura,
Masha Itkina,
Jeannette Bohg
Abstract:
In robot imitation learning, policy performance is tightly coupled with the quality and composition of the demonstration data. Yet, developing a precise understanding of how individual demonstrations contribute to downstream outcomes - such as closed-loop task success or failure - remains a persistent challenge. We propose CUPID, a robot data curation method based on a novel influence function-theoretic formulation for imitation learning policies. Given a set of evaluation rollouts, CUPID estimates the influence of each training demonstration on the policy's expected return. This enables ranking and selection of demonstrations according to their impact on the policy's closed-loop performance. We use CUPID to curate data by 1) filtering out training demonstrations that harm policy performance and 2) subselecting newly collected trajectories that will most improve the policy. Extensive simulated and hardware experiments show that our approach consistently identifies which data drives test-time performance. For example, training with less than 33% of curated data can yield state-of-the-art diffusion policies on the simulated RoboMimic benchmark, with similar gains observed in hardware. Furthermore, hardware experiments show that our method can identify robust strategies under distribution shift, isolate spurious correlations, and even enhance the post-training of generalist robot policies. Additional materials are made available at: https://cupid-curation.github.io.
Submitted 23 June, 2025;
originally announced June 2025.
-
HoMeR: Learning In-the-Wild Mobile Manipulation via Hybrid Imitation and Whole-Body Control
Authors:
Priya Sundaresan,
Rhea Malhotra,
Phillip Miao,
Jingyun Yang,
Jimmy Wu,
Hengyuan Hu,
Rika Antonova,
Francis Engelmann,
Dorsa Sadigh,
Jeannette Bohg
Abstract:
We introduce HoMeR, an imitation learning framework for mobile manipulation that combines whole-body control with hybrid action modes that handle both long-range and fine-grained motion, enabling effective performance on realistic in-the-wild tasks. At its core is a fast, kinematics-based whole-body controller that maps desired end-effector poses to coordinated motion across the mobile base and arm. Within this reduced end-effector action space, HoMeR learns to switch between absolute pose predictions for long-range movement and relative pose predictions for fine-grained manipulation, offloading low-level coordination to the controller and focusing learning on task-level decisions. We deploy HoMeR on a holonomic mobile manipulator with a 7-DoF arm in a real home. We compare HoMeR to baselines without hybrid actions or whole-body control across 3 simulated and 3 real household tasks such as opening cabinets, sweeping trash, and rearranging pillows. Across tasks, HoMeR achieves an overall success rate of 79.17% using just 20 demonstrations per task, outperforming the next best baseline by 29.17 on average. HoMeR is also compatible with vision-language models and can leverage their internet-scale priors to better generalize to novel object appearances, layouts, and cluttered scenes. In summary, HoMeR moves beyond tabletop settings and demonstrates a scalable path toward sample-efficient, generalizable manipulation in everyday indoor spaces. Code, videos, and supplementary material are available at: http://homer-manip.github.io
Submitted 1 June, 2025;
originally announced June 2025.
-
Mobi-$π$: Mobilizing Your Robot Learning Policy
Authors:
Jingyun Yang,
Isabella Huang,
Brandon Vu,
Max Bajracharya,
Rika Antonova,
Jeannette Bohg
Abstract:
Learned visuomotor policies are capable of performing increasingly complex manipulation tasks. However, most of these policies are trained on data collected from limited robot positions and camera viewpoints. This leads to poor generalization to novel robot positions, which limits the use of these policies on mobile platforms, especially for precise tasks like pressing buttons or turning faucets. In this work, we formulate the policy mobilization problem: find a mobile robot base pose in a novel environment that is in distribution with respect to a manipulation policy trained on a limited set of camera viewpoints. Compared to retraining the policy itself to be more robust to unseen robot base pose initializations, policy mobilization decouples navigation from manipulation and thus does not require additional demonstrations. Crucially, this problem formulation complements existing efforts to improve manipulation policy robustness to novel viewpoints and remains compatible with them. To study policy mobilization, we introduce the Mobi-$π$ framework, which includes: (1) metrics that quantify the difficulty of mobilizing a given policy, (2) a suite of simulated mobile manipulation tasks based on RoboCasa to evaluate policy mobilization, (3) visualization tools for analysis, and (4) several baseline methods. We also propose a novel approach that bridges navigation and manipulation by optimizing the robot's base pose to align with an in-distribution base pose for a learned policy. Our approach utilizes 3D Gaussian Splatting for novel view synthesis, a score function to evaluate pose suitability, and sampling-based optimization to identify optimal robot poses. We show that our approach outperforms baselines in both simulation and real-world environments, demonstrating its effectiveness for policy mobilization.
Submitted 29 May, 2025;
originally announced May 2025.
-
Causal-PIK: Causality-based Physical Reasoning with a Physics-Informed Kernel
Authors:
Carlota Parés-Morlans,
Michelle Yi,
Claire Chen,
Sarah A. Wu,
Rika Antonova,
Tobias Gerstenberg,
Jeannette Bohg
Abstract:
Tasks that involve complex interactions between objects with unknown dynamics make planning before execution difficult. These tasks require agents to iteratively improve their actions after actively exploring causes and effects in the environment. For these types of tasks, we propose Causal-PIK, a method that leverages Bayesian optimization to reason about causal interactions via a Physics-Informed Kernel to help guide efficient search for the best next action. Experimental results on Virtual Tools and PHYRE physical reasoning benchmarks show that Causal-PIK outperforms state-of-the-art results, requiring fewer actions to reach the goal. We also compare Causal-PIK to human studies, including results from a new user study we conducted on the PHYRE benchmark. We find that Causal-PIK remains competitive on tasks that are very challenging, even for human problem-solvers.
Submitted 30 May, 2025; v1 submitted 28 May, 2025;
originally announced May 2025.
-
Deformable Cargo Transport in Microgravity with Astrobee
Authors:
Daniel Morton,
Rika Antonova,
Brian Coltin,
Marco Pavone,
Jeannette Bohg
Abstract:
We present pyastrobee: a simulation environment and control stack for Astrobee in Python, with an emphasis on cargo manipulation and transport tasks. We also demonstrate preliminary success from a sampling-based MPC controller, using reduced-order models of NASA's cargo transfer bag (CTB) to control a high-order deformable finite element model. Our code is open-source, fully documented, and available at https://danielpmorton.github.io/pyastrobee
Submitted 2 May, 2025;
originally announced May 2025.
-
A Generic Hybrid Framework for 2D Visual Reconstruction
Authors:
Daniel Rika,
Dror Sholomon,
Eli David,
Alexandre Pais,
Nathan S. Netanyahu
Abstract:
This paper presents a versatile hybrid framework for addressing 2D real-world reconstruction tasks formulated as jigsaw puzzle problems (JPPs) with square, non-overlapping pieces. Our approach integrates a deep learning (DL)-based compatibility measure (CM) model that evaluates pairs of puzzle pieces holistically, rather than focusing solely on their adjacent edges as traditionally done. This DL-based CM is paired with an optimized genetic algorithm (GA)-based solver, which iteratively searches for a global optimal arrangement using the pairwise CM scores of the puzzle pieces. Extensive experimental results highlight the framework's adaptability and robustness across multiple real-world domains. Notably, our unique hybrid methodology achieves state-of-the-art (SOTA) results in reconstructing Portuguese tile panels and large degraded puzzles with eroded boundaries.
Submitted 31 January, 2025;
originally announced January 2025.
-
Concept Learning in the Wild: Towards Algorithmic Understanding of Neural Networks
Authors:
Elad Shoham,
Hadar Cohen,
Khalil Wattad,
Havana Rika,
Dan Vilenchik
Abstract:
Explainable AI (XAI) methods typically focus on identifying essential input features or more abstract concepts for tasks like image or text classification. However, for algorithmic tasks like combinatorial optimization, these concepts may depend not only on the input but also on the current state of the network, as in the case of graph neural networks (GNNs). This work studies concept learning for an existing GNN model trained to solve Boolean satisfiability (SAT). Our analysis reveals that the model learns key concepts matching those guiding human-designed SAT heuristics, particularly the notion of 'support.' We demonstrate that these concepts are encoded in the top principal components (PCs) of the embedding's covariance matrix, allowing for unsupervised discovery. Using sparse PCA, we establish the minimality of these concepts and show their teachability through a simplified GNN. Two direct applications of our framework are (a) improving the convergence time of the classical WalkSAT algorithm and (b) using the discovered concepts to "reverse-engineer" the black-box GNN and rewrite it as a white-box textbook algorithm. Our results highlight the potential of concept learning in understanding and enhancing algorithmic neural networks for combinatorial optimization tasks.
Submitted 15 December, 2024;
originally announced December 2024.
-
Unpacking Failure Modes of Generative Policies: Runtime Monitoring of Consistency and Progress
Authors:
Christopher Agia,
Rohan Sinha,
Jingyun Yang,
Zi-ang Cao,
Rika Antonova,
Marco Pavone,
Jeannette Bohg
Abstract:
Robot behavior policies trained via imitation learning are prone to failure under conditions that deviate from their training data. Thus, algorithms that monitor learned policies at test time and provide early warnings of failure are necessary to facilitate scalable deployment. We propose Sentinel, a runtime monitoring framework that splits the detection of failures into two complementary categories: 1) erratic failures, which we detect using statistical measures of temporal action consistency, and 2) task progression failures, where we use Vision Language Models (VLMs) to detect when the policy confidently and consistently takes actions that do not solve the task. Our approach has two key strengths. First, because learned policies exhibit diverse failure modes, combining complementary detectors leads to significantly higher failure detection accuracy. Second, using a statistical temporal action consistency measure ensures that we quickly detect when multimodal, generative policies exhibit erratic behavior at negligible computational cost. In contrast, we only use VLMs to detect failure modes that are less time-sensitive. We demonstrate our approach in the context of diffusion policies trained on robotic mobile manipulation domains in both simulation and the real world. By unifying temporal consistency detection and VLM runtime monitoring, Sentinel detects 18% more failures than using either of the two detectors alone and significantly outperforms baselines, thus highlighting the importance of assigning specialized detectors to complementary categories of failure. Qualitative results are made available at https://sites.google.com/stanford.edu/sentinel.
Submitted 10 October, 2024; v1 submitted 6 October, 2024;
originally announced October 2024.
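The Sentinel abstract above mentions "statistical measures of temporal action consistency" without defining them. As a rough illustration only, the sketch below assumes the policy predicts overlapping action chunks and scores disagreement on the overlap; the function names, the squared-distance score, and the quantile-based threshold are all hypothetical choices, not the paper's method.

```python
def chunk_inconsistency(prev_chunk, next_chunk, stride):
    """Mean squared distance on the overlap of two consecutive action chunks.

    Assumption (not from the abstract): both chunks have equal length and were
    predicted `stride` steps apart, so prev_chunk[stride:] covers the same
    time steps as the start of next_chunk.
    """
    overlap_prev = prev_chunk[stride:]
    overlap_next = next_chunk[:len(overlap_prev)]
    if not overlap_prev:
        raise ValueError("chunks do not overlap for this stride")
    total = 0.0
    for a, b in zip(overlap_prev, overlap_next):
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return total / len(overlap_prev)


def calibrate_threshold(scores, quantile=0.95):
    """Hypothetical calibration: empirical quantile of scores from successful rollouts."""
    ordered = sorted(scores)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def is_erratic(score, threshold):
    """Flag a time step whose inconsistency score exceeds the calibrated threshold."""
    return score > threshold
```

A score of zero means the newer chunk exactly continues the older plan; larger scores mean the policy changed its mind about the same future time steps.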
-
Towards Perceived Security, Perceived Privacy, and the Universal Design of E-Payment Applications
Authors:
Urvashi Kishnani,
Isabella Cardenas,
Jailene Castillo,
Rosalyn Conry,
Lukas Rodwin,
Rika Ruiz,
Matthew Walther,
Sanchari Das
Abstract:
With the growth of digital monetary transactions and cashless payments, encouraged by the COVID-19 pandemic, use of e-payment applications is on the rise. It is thus imperative to understand and evaluate the current posture of e-payment applications from three major user-facing angles: security, privacy, and usability. To this end, we created a high-fidelity prototype of an e-payment application that encompassed features that we wanted to test with users. We then conducted a pilot study where we recruited 12 participants who tested our prototype. We find that both security and privacy are important for users of e-payment applications. Additionally, some participants perceive the strength of security and privacy based on the usability of the application. We provide recommendations such as universal design of e-payment applications.
Submitted 7 July, 2024;
originally announced July 2024.
-
EquiBot: SIM(3)-Equivariant Diffusion Policy for Generalizable and Data Efficient Learning
Authors:
Jingyun Yang,
Zi-ang Cao,
Congyue Deng,
Rika Antonova,
Shuran Song,
Jeannette Bohg
Abstract:
Building effective imitation learning methods that enable robots to learn from limited data and still generalize across diverse real-world environments is a long-standing problem in robot learning. We propose EquiBot, a robust, data-efficient, and generalizable approach for robot manipulation task learning. Our approach combines SIM(3)-equivariant neural network architectures with diffusion models. This ensures that our learned policies are invariant to changes in scale, rotation, and translation, enhancing their applicability to unseen environments while retaining the benefits of diffusion-based policy learning such as multi-modality and robustness. We show on a suite of 6 simulation tasks that our proposed method reduces the data requirements and improves generalization to novel scenarios. In the real world, with 10 variations of 6 mobile manipulation tasks, we show that our method can easily generalize to novel objects and scenes after learning from just 5 minutes of human demonstrations in each task.
Submitted 29 October, 2024; v1 submitted 1 July, 2024;
originally announced July 2024.
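The SIM(3) property that the EquiBot abstract above builds on can be stated as f(T(x)) = T(f(x)) for any similarity transform T (scale, rotation, translation). The sketch below is a toy numerical check of that property, not the paper's architecture: the "policy" is just a point-cloud centroid, rotations are restricted to the z-axis for brevity, and all function names are hypothetical.

```python
import math


def apply_sim3(points, scale, yaw, translation):
    """Apply p -> scale * R_z(yaw) p + translation to each 3D point."""
    c, s = math.cos(yaw), math.sin(yaw)
    out = []
    for x, y, z in points:
        rx, ry, rz = c * x - s * y, s * x + c * y, z
        out.append((scale * rx + translation[0],
                    scale * ry + translation[1],
                    scale * rz + translation[2]))
    return out


def centroid(points):
    """Toy stand-in for a policy: map a point cloud to a target position."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def is_equivariant(f, points, scale, yaw, translation, tol=1e-9):
    """Check f(T(points)) == T(f(points)) for the given transform T."""
    lhs = f(apply_sim3(points, scale, yaw, translation))
    rhs = apply_sim3([f(points)], scale, yaw, translation)[0]
    return all(abs(a - b) <= tol for a, b in zip(lhs, rhs))
```

The centroid passes this check for any transform, whereas a function that adds a fixed offset to its output fails as soon as the scale differs from 1, which is the kind of failure an equivariant architecture rules out by construction.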
-
Integrating Supervised Extractive and Generative Language Models for Suicide Risk Evidence Summarization
Authors:
Rika Tanaka,
Yusuke Fukazawa
Abstract:
We propose a method that integrates supervised extractive and generative language models for providing supporting evidence of suicide risk in the CLPsych 2024 shared task. Our approach comprises three steps. Initially, we construct a BERT-based model for estimating sentence-level suicide risk and negative sentiment. Next, we precisely identify high suicide risk sentences by emphasizing elevated probabilities of both suicide risk and negative sentiment. Finally, we integrate generative summaries using the MentaLLaMa framework and extractive summaries from identified high suicide risk sentences and a specialized dictionary of suicidal risk words. SophiaADS, our team, achieved 1st place for highlight extraction and ranked 10th for summary generation, based on recall and consistency metrics, respectively.
Submitted 20 March, 2024;
originally announced March 2024.
-
EquivAct: SIM(3)-Equivariant Visuomotor Policies beyond Rigid Object Manipulation
Authors:
Jingyun Yang,
Congyue Deng,
Jimmy Wu,
Rika Antonova,
Leonidas Guibas,
Jeannette Bohg
Abstract:
If a robot masters folding a kitchen towel, we would expect it to master folding a large beach towel. However, existing policy learning methods that rely on data augmentation still don't guarantee such generalization. Our insight is to add equivariance to both the visual object representation and policy architecture. We propose EquivAct which utilizes SIM(3)-equivariant network structures that guarantee generalization across all possible object translations, 3D rotations, and scales by construction. EquivAct is trained in two phases. We first pre-train a SIM(3)-equivariant visual representation on simulated scene point clouds. Then, we learn a SIM(3)-equivariant visuomotor policy using a small amount of source task demonstrations. We show that the learned policy directly transfers to objects that substantially differ from demonstrations in scale, position, and orientation. We evaluate our method in three manipulation tasks involving deformable and articulated objects, going beyond typical rigid object manipulation tasks considered in prior work. We conduct experiments both in simulation and in reality. For real robot experiments, our method uses 20 human demonstrations of a tabletop task and transfers zero-shot to a mobile manipulation task in a much larger setup. Experiments confirm that our contrastive pre-training procedure and equivariant architecture offer significant improvements over prior work. Project website: https://equivact.github.io
Submitted 14 May, 2024; v1 submitted 24 October, 2023;
originally announced October 2023.
-
TidyBot: Personalized Robot Assistance with Large Language Models
Authors:
Jimmy Wu,
Rika Antonova,
Adam Kan,
Marion Lepert,
Andy Zeng,
Shuran Song,
Jeannette Bohg,
Szymon Rusinkiewicz,
Thomas Funkhouser
Abstract:
For a robot to personalize physical assistance effectively, it must learn user preferences that can be generally reapplied to future scenarios. In this work, we investigate personalization of household cleanup with robots that can tidy up rooms by picking up objects and putting them away. A key challenge is determining the proper place to put each object, as people's preferences can vary greatly depending on personal taste or cultural background. For instance, one person may prefer storing shirts in the drawer, while another may prefer them on the shelf. We aim to build systems that can learn such preferences from just a handful of examples via prior interactions with a particular person. We show that robots can combine language-based planning and perception with the few-shot summarization capabilities of large language models (LLMs) to infer generalized user preferences that are broadly applicable to future interactions. This approach enables fast adaptation and achieves 91.2% accuracy on unseen objects in our benchmark dataset. We also demonstrate our approach on a real-world mobile manipulator called TidyBot, which successfully puts away 85.0% of objects in real-world test scenarios.
Submitted 11 October, 2023; v1 submitted 9 May, 2023;
originally announced May 2023.
-
Edge2Vec: A High Quality Embedding for the Jigsaw Puzzle Problem
Authors:
Daniel Rika,
Dror Sholomon,
Eli David,
Nathan S. Netanyahu
Abstract:
Pairwise compatibility measure (CM) is a key component in solving the jigsaw puzzle problem (JPP) and many of its recently proposed variants. With the rapid rise of deep neural networks (DNNs), a trade-off between performance (i.e., accuracy) and computational efficiency has become a very significant issue. Whereas an end-to-end DNN-based CM model exhibits high performance, it becomes virtually infeasible on very large puzzles, due to its highly intensive computation. On the other hand, exploiting the concept of embeddings to significantly reduce the computational cost has resulted in degraded performance, according to recent studies. This paper derives an advanced CM model (based on modified embeddings and a new loss function, called hard batch triplet loss) for closing the above gap between speed and accuracy; namely, a CM model that achieves SOTA results in terms of performance and efficiency combined. We evaluated our newly derived CM on three commonly used datasets, and obtained a reconstruction improvement of 5.8% and 19.5% for so-called Type-1 and Type-2 problem variants, respectively, compared to the best known results obtained with previous CMs.
Submitted 22 December, 2022; v1 submitted 14 November, 2022;
originally announced November 2022.
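The Edge2Vec abstract above names a "hard batch triplet loss" but does not define it. For orientation, the sketch below shows the generic batch-hard triplet loss commonly used in metric learning, where each anchor is paired with its farthest positive and closest negative in the batch; the paper's variant may differ, and the labels, margin value, and function name here are assumptions made for illustration.

```python
import math


def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Generic batch-hard triplet loss (an assumed stand-in, not the paper's exact loss).

    For each anchor, take the hardest positive (farthest same-label embedding)
    and the hardest negative (closest different-label embedding), then apply
    the hinge max(0, d_pos - d_neg + margin). Anchors lacking a positive or a
    negative in the batch are skipped.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        pos = [math.dist(anchor, e)
               for j, (e, l) in enumerate(zip(embeddings, labels))
               if l == label and j != i]
        neg = [math.dist(anchor, e)
               for e, l in zip(embeddings, labels)
               if l != label]
        if not pos or not neg:
            continue
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0
```

In a puzzle setting one might treat the embeddings of two truly adjacent piece edges as sharing a label, though that mapping is a guess and is not stated in the abstract.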
-
Learning Tool Morphology for Contact-Rich Manipulation Tasks with Differentiable Simulation
Authors:
Mengxi Li,
Rika Antonova,
Dorsa Sadigh,
Jeannette Bohg
Abstract:
When humans perform contact-rich manipulation tasks, customized tools are often necessary to simplify the task. For instance, we use various utensils for handling food, such as knives, forks and spoons. Similarly, robots may benefit from specialized tools that enable them to more easily complete a variety of tasks. We present an end-to-end framework to automatically learn tool morphology for contact-rich manipulation tasks by leveraging differentiable physics simulators. Previous work relied on manually constructed priors requiring detailed specification of a 3D object model, grasp pose and task description to facilitate the search or optimization process. Our approach only requires defining the objective with respect to task performance and enables learning a robust morphology through randomizing variations of the task. We make this optimization tractable by casting it as a continual learning problem. We demonstrate the effectiveness of our method for designing new tools in several scenarios, such as winding ropes, flipping a box and pushing peas onto a scoop in simulation. Additionally, experiments with real robots show that the tool shapes discovered by our method help them succeed in these scenarios.
Submitted 25 February, 2023; v1 submitted 3 November, 2022;
originally announced November 2022.
-
In-Hand Manipulation of Unknown Objects with Tactile Sensing for Insertion
Authors:
Chaoyi Pan,
Marion Lepert,
Shenli Yuan,
Rika Antonova,
Jeannette Bohg
Abstract:
In this paper, we present a method to manipulate unknown objects in-hand using tactile sensing without relying on a known object model. In many cases, vision-only approaches may not be feasible; for example, due to occlusion in cluttered spaces. We address this limitation by introducing a method to reorient unknown objects using tactile sensing. It incrementally builds a probabilistic estimate of the object shape and pose during task-driven manipulation. Our approach uses Bayesian optimization to balance exploration of the global object shape with efficient task completion. To demonstrate the effectiveness of our method, we apply it to a simulated Tactile-Enabled Roller Grasper, a gripper that rolls objects in hand while collecting tactile data. We evaluate our method on an insertion task with randomly generated objects and find that it reliably reorients objects while significantly reducing the exploration time.
Submitted 10 March, 2023; v1 submitted 24 October, 2022;
originally announced October 2022.
-
Gather -- a better way to codehack online
Authors:
Rika Kobayashi,
Sarah Jaffa,
Jiachen Dong,
Roger D. Amos,
Jeremy Cohen,
Emily F. Kerrison
Abstract:
A virtual hands-on computer laboratory has been designed within the Gather online meeting platform. Gather's features such as spatial audio, private spaces and interactable objects offer scope for great improvements over currently used platforms, especially for small-group based teaching. We describe our experience using this virtual computer laboratory for a recent 'Python for Beginners' workshop held as part of the Software Sustainability Institute's 2022 Research Software Camp.
Submitted 5 September, 2022;
originally announced September 2022.
-
Rethinking Optimization with Differentiable Simulation from a Global Perspective
Authors:
Rika Antonova,
Jingyun Yang,
Krishna Murthy Jatavallabhula,
Jeannette Bohg
Abstract:
Differentiable simulation is a promising toolkit for fast gradient-based policy optimization and system identification. However, existing approaches to differentiable simulation have largely tackled scenarios where obtaining smooth gradients has been relatively easy, such as systems with mostly smooth dynamics. In this work, we study the challenges that differentiable simulation presents when it is not feasible to expect that a single descent reaches a global optimum, which is often a problem in contact-rich scenarios. We analyze the optimization landscapes of diverse scenarios that contain both rigid bodies and deformable objects. In dynamic environments with highly deformable objects and fluids, differentiable simulators produce rugged landscapes with nonetheless useful gradients in some parts of the space. We propose a method that combines Bayesian optimization with semi-local 'leaps' to obtain a global search method that can use gradients effectively, while also maintaining robust performance in regions with noisy gradients. We show that our approach outperforms several gradient-based and gradient-free baselines on an extensive set of experiments in simulation, and also validate the method using experiments with a real robot and deformables. Videos and supplementary materials are available at https://tinyurl.com/globdiff
Submitted 28 June, 2022;
originally announced July 2022.
-
DiffCloud: Real-to-Sim from Point Clouds with Differentiable Simulation and Rendering of Deformable Objects
Authors:
Priya Sundaresan,
Rika Antonova,
Jeannette Bohg
Abstract:
Research in manipulation of deformable objects is typically conducted on a limited range of scenarios, because handling each scenario on hardware takes significant effort. Realistic simulators with support for various types of deformations and interactions have the potential to speed up experimentation with novel tasks and algorithms. However, for highly deformable objects it is challenging to align the output of a simulator with the behavior of real objects. Manual tuning is not intuitive, hence automated methods are needed. We view this alignment problem as a joint perception-inference challenge and demonstrate how to use recent neural network architectures to successfully perform simulation parameter inference from real point clouds. We analyze the performance of various architectures, comparing their data and training requirements. Furthermore, we propose to leverage differentiable point cloud sampling and differentiable simulation to significantly reduce the time to achieve the alignment. We employ an efficient way to propagate gradients from point clouds to simulated meshes and further through to the physical simulation parameters, such as mass and stiffness. Experiments with highly deformable objects show that our method can achieve comparable or better alignment with real object behavior, while reducing the time needed to achieve this by more than an order of magnitude. Videos and supplementary material are available at https://diffcloud.github.io.
Submitted 13 May, 2025; v1 submitted 6 April, 2022;
originally announced April 2022.
-
TEN: Twin Embedding Networks for the Jigsaw Puzzle Problem with Eroded Boundaries
Authors:
Daniel Rika,
Dror Sholomon,
Eli David,
Nathan S. Netanyahu
Abstract:
This paper introduces the novel CNN-based encoder Twin Embedding Network (TEN), for the jigsaw puzzle problem (JPP), which represents a puzzle piece with respect to its boundary in a latent embedding space. Combining this latent representation with a simple distance measure, we demonstrate improved accuracy levels of our newly proposed pairwise compatibility measure (CM), compared to that of various classical methods, for degraded puzzles with eroded tile boundaries. We focus on this problem instance for our case study, as it serves as an appropriate testbed for real-world scenarios. Specifically, we demonstrated an improvement of up to 8.5% and 16.8% in reconstruction accuracy, for so-called Type-1 and Type-2 problem variants, respectively. Furthermore, we also demonstrated that TEN is faster by a few orders of magnitude, on average, than a typical deep neural network (NN) model, i.e., it is as fast as the classical methods. In this regard, the paper makes a significant first attempt at bridging the gap between the relatively low accuracy (of classical methods) and the intensive computational complexity (of NN models), for practical, real-world puzzle-like problems.
Submitted 7 November, 2022; v1 submitted 12 March, 2022;
originally announced March 2022.
-
A Bayesian Treatment of Real-to-Sim for Deformable Object Manipulation
Authors:
Rika Antonova,
Jingyun Yang,
Priya Sundaresan,
Dieter Fox,
Fabio Ramos,
Jeannette Bohg
Abstract:
Deformable object manipulation remains a challenging task in robotics research. Conventional techniques for parameter inference and state estimation typically rely on a precise definition of the state space and its dynamics. While this is appropriate for rigid objects and robot states, it is challenging to define the state space of a deformable object and how it evolves in time. In this work, we pose the problem of inferring physical parameters of deformable objects as a probabilistic inference task defined with a simulator. We propose a novel methodology for extracting state information from image sequences via a technique to represent the state of a deformable object as a distribution embedding. This allows noisy state observations to be incorporated directly into modern Bayesian simulation-based inference tools in a principled manner. Our experiments confirm that we can estimate posterior distributions of physical properties, such as elasticity, friction and scale of highly deformable objects, such as cloth and ropes. Overall, our method addresses the real-to-sim problem probabilistically and helps to better represent the evolution of the state of deformable objects.
Submitted 9 December, 2021;
originally announced December 2021.
-
Learning Periodic Tasks from Human Demonstrations
Authors:
Jingyun Yang,
Junwu Zhang,
Connor Settle,
Akshara Rai,
Rika Antonova,
Jeannette Bohg
Abstract:
We develop a method for learning periodic tasks from visual demonstrations. The core idea is to leverage periodicity in the policy structure to model periodic aspects of the tasks. We use active learning to optimize parameters of rhythmic dynamic movement primitives (rDMPs) and propose an objective to maximize the similarity between the motion of objects manipulated by the robot and the desired motion in human video demonstrations. We consider tasks with deformable objects and granular matter whose states are challenging to represent and track: wiping surfaces with a cloth, winding cables/wires, and stirring granular matter with a spoon. Our method does not require tracking markers or manual annotations. The initial training data consists of 10-minute videos of random unpaired interactions with objects by the robot and human. We use these for unsupervised learning of a keypoint model to get task-agnostic visual correspondences. Then, we use Bayesian optimization to optimize rDMPs from a single human video demonstration within a few robot trials. We present simulation and hardware experiments to validate our approach.
Submitted 20 May, 2022; v1 submitted 28 September, 2021;
originally announced September 2021.
-
BayesSimIG: Scalable Parameter Inference for Adaptive Domain Randomization with IsaacGym
Authors:
Rika Antonova,
Fabio Ramos,
Rafael Possas,
Dieter Fox
Abstract:
BayesSim is a statistical technique for domain randomization in reinforcement learning based on likelihood-free inference of simulation parameters. This paper outlines BayesSimIG: a library that provides an implementation of BayesSim integrated with the recently released NVIDIA IsaacGym. This combination allows large-scale parameter inference with end-to-end GPU acceleration. Both inference and simulation get GPU speedup, with support for running more than 10K parallel simulation environments for complex robotics tasks that can have more than 100 simulation parameters to estimate. BayesSimIG provides an integration with TensorBoard to easily visualize slices of high-dimensional posteriors. The library is built in a modular way to support research experiments with novel ways to collect and process the trajectories from the parallel IsaacGym environments.
Submitted 9 July, 2021;
originally announced July 2021.
-
Sequential Topological Representations for Predictive Models of Deformable Objects
Authors:
Rika Antonova,
Anastasiia Varava,
Peiyang Shi,
J. Frederico Carvalho,
Danica Kragic
Abstract:
Deformable objects present a formidable challenge for robotic manipulation due to the lack of canonical low-dimensional representations and the difficulty of capturing, predicting, and controlling such objects. We construct compact topological representations to capture the state of highly deformable objects that are topologically nontrivial. We develop an approach that tracks the evolution of this topological state through time. Under several mild assumptions, we prove that the topology of the scene and its evolution can be recovered from point clouds representing the scene. Our further contribution is a method to learn predictive models that take a sequence of past point cloud observations as input and predict a sequence of topological states, conditioned on target/future control actions. Our experiments with highly deformable objects in simulation show that the proposed multistep predictive models yield more precise results than those obtained from computational topology libraries. These models can leverage patterns inferred across various objects and offer fast multistep predictions suitable for real-time applications.
Submitted 10 May, 2021; v1 submitted 23 November, 2020;
originally announced November 2020.
-
Analytic Manifold Learning: Unifying and Evaluating Representations for Continuous Control
Authors:
Rika Antonova,
Maksim Maydanskiy,
Danica Kragic,
Sam Devlin,
Katja Hofmann
Abstract:
We address the problem of learning reusable state representations from streaming high-dimensional observations. This is important for areas like Reinforcement Learning (RL), which yields non-stationary data distributions during training. We make two key contributions. First, we propose an evaluation suite that measures alignment between latent and true low-dimensional states. We benchmark several widely used unsupervised learning approaches. This uncovers the strengths and limitations of existing approaches that impose additional constraints/objectives on the latent space. Our second contribution is a unifying mathematical formulation for learning latent relations. We learn analytic relations on source domains, then use these relations to help structure the latent space when learning on target domains. This formulation enables a more general, flexible and principled way of shaping the latent space. It formalizes the notion of learning independent relations, without imposing restrictive simplifying assumptions or requiring domain-specific information. We present mathematical properties, concrete algorithms for implementation and experimental validation of successful learning and transfer of latent relations.
Submitted 6 October, 2020; v1 submitted 15 June, 2020;
originally announced June 2020.
-
Faster Algorithms for Orienteering and $k$-TSP
Authors:
Lee-Ad Gottlieb,
Robert Krauthgamer,
Havana Rika
Abstract:
We consider the rooted orienteering problem in Euclidean space: Given $n$ points $P$ in $\mathbb R^d$, a root point $s\in P$ and a budget $\mathcal B>0$, find a path that starts from $s$, has total length at most $\mathcal B$, and visits as many points of $P$ as possible. This problem is known to be NP-hard, hence we study $(1-δ)$-approximation algorithms. The previous Polynomial-Time Approximation Scheme (PTAS) for this problem, due to Chen and Har-Peled (2008), runs in time $n^{O(d\sqrt{d}/δ)}(\log n)^{(d/δ)^{O(d)}}$, and improving on this time bound was left as an open problem. Our main contribution is a PTAS with a significantly improved time complexity of $n^{O(1/δ)}(\log n)^{(d/δ)^{O(d)}}$.
A known technique for approximating the orienteering problem is to reduce it to solving $1/δ$ correlated instances of rooted $k$-TSP (a $k$-TSP tour is one that visits at least $k$ points). However, the $k$-TSP tours in this reduction must achieve a certain excess guarantee (namely, their length can surpass the optimum length only in proportion to a parameter of the optimum called excess) that is stronger than the usual $(1+δ)$-approximation. Our main technical contribution is to improve the running time of these $k$-TSP variants, particularly in its dependence on the dimension $d$. Indeed, our running time is polynomial even for a moderately large dimension, roughly up to $d=O(\log\log n)$ instead of $d=O(1)$.
Submitted 21 April, 2022; v1 submitted 18 February, 2020;
originally announced February 2020.
-
A Novel Hybrid Scheme Using Genetic Algorithms and Deep Learning for the Reconstruction of Portuguese Tile Panels
Authors:
Daniel Rika,
Dror Sholomon,
Eli David,
Nathan S. Netanyahu
Abstract:
This paper presents a novel scheme, based on a unique combination of genetic algorithms (GAs) and deep learning (DL), for the automatic reconstruction of Portuguese tile panels, a challenging real-world variant of the jigsaw puzzle problem (JPP) with important national heritage implications. Specifically, we introduce an enhanced GA-based puzzle solver, whose integration with a novel DL-based compatibility measure (DLCM) yields state-of-the-art performance for this application. Current compatibility measures typically consider (the chromatic information of) edge pixels (between adjacent tiles), and help achieve high accuracy for the synthetic JPP variant. However, such measures exhibit rather poor performance when applied to the Portuguese tile panels, which are susceptible to various real-world effects, e.g., monochromatic panels, non-squared tiles, edge degradation, etc. To overcome such difficulties, we have developed a novel DLCM to extract high-level texture/color statistics from the entire tile information.
Integrating this measure with our enhanced GA-based puzzle solver, we have demonstrated, for the first time, how to deal most effectively with large-scale real-world problems, such as the Portuguese tile problem. Specifically, we have achieved 82% accuracy for the reconstruction of Portuguese tile panels with unknown piece rotation and puzzle dimension (compared to merely 3.5% average accuracy achieved by the best method known for solving this problem variant). The proposed method outperforms even human experts in several cases, correcting their mistakes in the manual tile assembly.
Submitted 4 December, 2019;
originally announced December 2019.
-
Towards Low-Latency High-Bandwidth Control of Quadrotors using Event Cameras
Authors:
Rika Sugimoto Dimitrova,
Mathias Gehrig,
Dario Brescianini,
Davide Scaramuzza
Abstract:
Event cameras are a promising candidate to enable high speed vision-based control due to their low sensor latency and high temporal resolution. However, purely event-based feedback has yet to be used in the control of drones. In this work, a first step towards implementing low-latency high-bandwidth control of quadrotors using event cameras is taken. In particular, this paper addresses the problem of one-dimensional attitude tracking using a dualcopter platform equipped with an event camera. The event-based state estimation consists of a modified Hough transform algorithm combined with a Kalman filter that outputs the roll angle and angular velocity of the dualcopter relative to a horizon marked by a black-and-white disk. The estimated state is processed by a proportional-derivative attitude control law that computes the rotor thrusts required to track the desired attitude. The proposed attitude tracking scheme shows promising results of event-camera-driven closed loop control: the state estimator performs with an update rate of 1 kHz and a latency determined to be 12 ms, enabling attitude tracking at speeds of over 1600 deg/s.
Submitted 28 March, 2020; v1 submitted 11 November, 2019;
originally announced November 2019.
-
Bayesian Optimization in Variational Latent Spaces with Dynamic Compression
Authors:
Rika Antonova,
Akshara Rai,
Tianyu Li,
Danica Kragic
Abstract:
Data-efficiency is crucial for autonomous robots to adapt to new tasks and environments. In this work we focus on robotics problems with a budget of only 10-20 trials. This is a very challenging setting even for data-efficient approaches like Bayesian optimization (BO), especially when optimizing higher-dimensional controllers. Simulated trajectories can be used to construct informed kernels for BO. However, previous work employed supervised ways of extracting low-dimensional features for these. We propose a model and architecture for a sequential variational autoencoder that embeds the space of simulated trajectories into a lower-dimensional space of latent paths in an unsupervised way. We further compress the search space for BO by reducing exploration in parts of the state space that are undesirable, without requiring explicit constraints on controller parameters. We validate our approach with hardware experiments on a Daisy hexapod robot and an ABB Yumi manipulator. We also present simulation experiments with further comparisons to several baselines on Daisy and two manipulators. Our experiments indicate the proposed trajectory-based kernel with dynamic compression can offer ultra data-efficient optimization.
Submitted 10 July, 2019;
originally announced July 2019.
-
Flow-Cut Gaps and Face Covers in Planar Graphs
Authors:
Robert Krauthgamer,
James R. Lee,
Havana Rika
Abstract:
The relationship between the sparsest cut and the maximum concurrent multi-flow in graphs has been studied extensively. For general graphs with $k$ terminal pairs, the flow-cut gap is $O(\log k)$, and this is tight. But when topological restrictions are placed on the flow network, the situation is far less clear. In particular, it has been conjectured that the flow-cut gap in planar networks is $O(1)$, while the known bounds place the gap somewhere between $2$ (Lee and Raghavendra, 2003) and $O(\sqrt{\log k})$ (Rao, 1999).
A seminal result of Okamura and Seymour (1981) shows that when all the terminals of a planar network lie on a single face, the flow-cut gap is exactly $1$. This setting can be generalized by considering planar networks where the terminals lie on $γ>1$ faces in some fixed planar drawing. Lee and Sidiropoulos (2009) proved that the flow-cut gap is bounded by a function of $γ$, and Chekuri, Shepherd, and Weibel (2013) showed that the gap is at most $3γ$. We prove that the flow-cut gap is $O(\logγ)$, by showing that the edge-weighted shortest-path metric induced on the terminals admits a stochastic embedding into trees with distortion $O(\logγ)$, which is tight.
The preceding results refer to the setting of edge-capacitated networks. For vertex-capacitated networks, it can be significantly more challenging to control flow-cut gaps. While there is no exact vertex-capacitated version of the Okamura-Seymour Theorem, an approximate version holds; Lee, Mendel, and Moharrami (2015) showed that the vertex-capacitated flow-cut gap is $O(1)$ on planar networks whose terminals lie on a single face. We prove that the flow-cut gap is $O(γ)$ for vertex-capacitated instances when the terminals lie on at most $γ$ faces. In fact, this result holds in the more general setting of submodular vertex capacities.
Submitted 6 November, 2018;
originally announced November 2018.
-
Global Search with Bernoulli Alternation Kernel for Task-oriented Grasping Informed by Simulation
Authors:
Rika Antonova,
Mia Kokic,
Johannes A. Stork,
Danica Kragic
Abstract:
We develop an approach that benefits from large simulated datasets and takes full advantage of the limited online data that is most relevant. We propose a variant of Bayesian optimization that alternates between using informed and uninformed kernels. With this Bernoulli Alternation Kernel we ensure that discrepancies between simulation and reality do not hinder adapting robot control policies online. The proposed approach is applied to a challenging real-world problem of task-oriented grasping with novel objects. Our further contribution is a neural network architecture and training pipeline that use experience from grasping objects in simulation to learn grasp stability scores. We learn task scores from a labeled dataset with a convolutional network, which is used to construct an informed kernel for our variant of Bayesian optimization. Experiments on an ABB Yumi robot with real sensor data demonstrate success of our approach, despite the challenge of fulfilling task requirements and high uncertainty over physical properties of objects.
Submitted 10 October, 2018;
originally announced October 2018.
-
Using Simulation to Improve Sample-Efficiency of Bayesian Optimization for Bipedal Robots
Authors:
Akshara Rai,
Rika Antonova,
Franziska Meier,
Christopher G. Atkeson
Abstract:
Learning for control can acquire controllers for novel robotic tasks, paving the path for autonomous agents. Such controllers can be expert-designed policies, which typically require tuning of parameters for each task scenario. In this context, Bayesian optimization (BO) has emerged as a promising approach for automatically tuning controllers. However, when performing BO on hardware for high-dimensional policies, sample-efficiency can be an issue. Here, we develop an approach that utilizes simulation to map the original parameter space into a domain-informed space. During BO, similarity between controllers is now calculated in this transformed space. Experiments on the ATRIAS robot hardware and another bipedal robot simulation show that our approach succeeds at sample-efficiently learning controllers for multiple robots. Another question arises: What if the simulation significantly differs from hardware? To answer this, we create increasingly approximate simulators and study the effect of increasing simulation-hardware mismatch on the performance of Bayesian optimization. We also compare our approach to other approaches from literature, and find it to be more reliable, especially in cases of high mismatch. Our experiments show that our approach succeeds across different controller types, bipedal robot models and simulator fidelity levels, making it applicable to a wide range of bipedal locomotion problems.
Submitted 7 May, 2018;
originally announced May 2018.
-
Bayesian Optimization Using Domain Knowledge on the ATRIAS Biped
Authors:
Akshara Rai,
Rika Antonova,
Seungmoon Song,
William Martin,
Hartmut Geyer,
Christopher G. Atkeson
Abstract:
Controllers in robotics often consist of expert-designed heuristics, which can be hard to tune in higher dimensions. It is typical to use simulation to learn these parameters, but controllers learned in simulation often don't transfer to hardware. This necessitates optimization directly on hardware. However, collecting data on hardware can be expensive. This has led to a recent interest in adapting data-efficient learning techniques to robotics. One popular method is Bayesian Optimization (BO), a sample-efficient black-box optimization scheme, but its performance typically degrades in higher dimensions. We aim to overcome this problem by incorporating domain knowledge to reduce dimensionality in a meaningful way, with a focus on bipedal locomotion. In previous work, we proposed a transformation based on knowledge of human walking that projected a 16-dimensional controller to a 1-dimensional space. In simulation, this showed enhanced sample efficiency when optimizing human-inspired neuromuscular walking controllers on a humanoid model. In this paper, we present a generalized feature transform applicable to non-humanoid robot morphologies and evaluate it on the ATRIAS bipedal robot -- in simulation and on hardware. We present three different walking controllers; two are evaluated on the real robot. Our results show that this feature transform captures important aspects of walking and accelerates learning on hardware and simulation, as compared to traditional BO.
Submitted 18 September, 2017;
originally announced September 2017.
-
Deep Kernels for Optimizing Locomotion Controllers
Authors:
Rika Antonova,
Akshara Rai,
Christopher G. Atkeson
Abstract:
Sample efficiency is important when optimizing parameters of locomotion controllers, since hardware experiments are time consuming and expensive. Bayesian Optimization, a sample-efficient optimization framework, has recently been widely applied to address this problem, but further improvements in sample efficiency are needed for practical applicability to real-world robots and high-dimensional controllers. To address this, prior work has proposed using domain expertise for constructing custom distance metrics for locomotion. In this work we show how to learn such a distance metric automatically. We use a neural network to learn an informed distance metric from data obtained in high-fidelity simulations. We conduct experiments on two different controllers and robot architectures. First, we demonstrate improvement in sample efficiency when optimizing a 5-dimensional controller on the ATRIAS robot hardware. We then conduct simulation experiments to optimize a 16-dimensional controller for a 7-link robot model and obtain significant improvements even when optimizing in perturbed environments. This demonstrates that our approach is able to enhance sample efficiency for two different controllers, hence is a fitting candidate for further experiments on hardware in the future.
Submitted 8 November, 2017; v1 submitted 27 July, 2017;
originally announced July 2017.
-
Unlocking the Potential of Simulators: Design with RL in Mind
Authors:
Rika Antonova,
Silvia Cruciani
Abstract:
Using Reinforcement Learning (RL) in simulation to construct policies useful in real life is challenging. This is often attributed to the sequential decision making aspect: inaccuracies in simulation accumulate over multiple steps, hence the simulated trajectories diverge from what would happen in reality.
In our work we show the need to consider another important aspect: the mismatch in simulating control. We bring attention to the need for modeling control as well as dynamics, since oversimplifying assumptions about applying actions of RL policies could make the policies fail on real-world systems.
We design a simulator for solving a pivoting task (of interest in Robotics) and demonstrate that even a simple simulator designed with RL in mind outperforms high-fidelity simulators when it comes to learning a policy that is to be deployed on a real robotic system. We show that a phenomenon that is hard to model - friction - could be exploited successfully, even when RL is performed using a simulator with a simple dynamics and noise model. Hence, we demonstrate that as long as the main sources of uncertainty are identified, it could be possible to learn policies applicable to real systems even using a simple simulator.
RL-compatible simulators could open up possibilities for applying a wide range of RL algorithms in various fields. This is important, since currently data sparsity in fields like healthcare and education frequently forces researchers and engineers to only consider sample-efficient RL approaches. Successful simulator-aided RL could increase the flexibility of experimenting with RL algorithms and help apply RL policies to real-world settings in fields where data is scarce. We believe that lessons learned in Robotics could help other fields design RL-compatible simulators, so we summarize our experience and conclude with suggestions.
Submitted 8 June, 2017;
originally announced June 2017.
-
Reinforcement Learning for Pivoting Task
Authors:
Rika Antonova,
Silvia Cruciani,
Christian Smith,
Danica Kragic
Abstract:
In this work we propose an approach to learn a robust policy for solving the pivoting task. Recently, several model-free continuous control algorithms were shown to learn successful policies without prior knowledge of the dynamics of the task. However, obtaining successful policies required thousands to millions of training episodes, limiting the applicability of these approaches to real hardware. We developed a training procedure that allows us to use a simple custom simulator to learn policies robust to the mismatch of simulation vs robot. In our experiments, we demonstrate that the policy learned in the simulator is able to pivot the object to the desired target angle on the real robot. We also show generalization to an object with different inertia, shape, mass and friction properties than those used during training. This result is a step towards making model-free reinforcement learning available for solving robotics tasks via pre-training in simulators that offer only an imprecise match to the real-world dynamics.
Submitted 1 March, 2017;
originally announced March 2017.
-
Refined Vertex Sparsifiers of Planar Graphs
Authors:
Robert Krauthgamer,
Havana (Inbal) Rika
Abstract:
We study the following version of cut sparsification. Given a large edge-weighted network $G$ with $k$ terminal vertices, compress it into a smaller network $H$ with the same terminals, such that every minimum terminal cut in $H$ approximates the corresponding one in $G$, up to a factor $q\geq 1$ that is called the quality. (The case $q=1$ is also known as a mimicking network.) We provide new insights about the structure of minimum terminal cuts, leading to new results for cut sparsifiers of planar graphs. Our first contribution identifies a subset of the minimum terminal cuts, which we call elementary, that generates all the others. Consequently, $H$ is a cut sparsifier if and only if it preserves all the elementary terminal cuts (up to this factor $q$). This structural characterization leads to improved bounds on the size of $H$. For example, it improves the bound on mimicking-network size for planar graphs to a near-optimal one. Our second and main contribution is to refine the known bounds in terms of $γ=γ(G)$, which is defined as the minimum number of faces that are incident to all the terminals in a planar graph $G$. We prove that the number of elementary terminal cuts is $O((2k/γ)^{2γ})$ (compared to $O(2^k)$ terminal cuts), and furthermore obtain a mimicking network of size $O(γ2^{2γ} k^4)$, which is near-optimal as a function of $γ$. In the analysis we break the elementary terminal cuts into fragments, and count them carefully. Our third contribution is a duality between cut sparsification and distance sparsification for certain planar graphs, when the sparsifier $H$ is required to be a minor of $G$. This duality connects problems that were previously studied separately, implying new results, new proofs of known results, and equivalences between open gaps.
Submitted 4 October, 2019; v1 submitted 20 February, 2017;
originally announced February 2017.
-
Sample Efficient Optimization for Learning Controllers for Bipedal Locomotion
Authors:
Rika Antonova,
Akshara Rai,
Christopher G. Atkeson
Abstract:
Learning policies for bipedal locomotion can be difficult, as experiments are expensive and simulation does not usually transfer well to hardware. To counter this, we need algorithms that are sample efficient and inherently safe. Bayesian Optimization is a powerful sample-efficient tool for optimizing non-convex black-box functions. However, its performance can degrade in higher dimensions. We develop a distance metric for bipedal locomotion that enhances the sample efficiency of Bayesian Optimization and use it to train a 16-dimensional neuromuscular model for planar walking. This distance metric reflects some basic gait features of healthy walking and helps us quickly eliminate a majority of unstable controllers. With our approach we can learn policies for walking in fewer than 100 trials for a range of challenging settings. In simulation, we show results on two different costs and on various terrains including rough ground and ramps, sloping upwards and downwards. We also perturb our models with unknown inertial disturbances analogous to differences between simulation and hardware. These results are promising, as they indicate that this method can potentially be used to learn control policies on hardware.
Submitted 15 October, 2016;
originally announced October 2016.
-
Moving robots efficiently using the combinatorics of CAT(0) cubical complexes
Authors:
Federico Ardila,
Tia Baker,
Rika Yatchak
Abstract:
Given a reconfigurable system X, such as a robot moving on a grid or a set of particles traversing a graph without colliding, the possible positions of X naturally form a cubical complex S(X). When S(X) is a CAT(0) space, we can explicitly construct the shortest path between any two points, for any of the four most natural metrics: distance, time, number of moves, and number of steps of simultaneous moves.
CAT(0) cubical complexes are in correspondence with posets with inconsistent pairs (PIPs), so we can prove that a state complex S(X) is CAT(0) by identifying the corresponding PIP. We illustrate this very general strategy with one known and one new example: Abrams and Ghrist's positive robotic arm on a square grid, and the robotic arm in a strip. We then use the PIP as a combinatorial "remote control" to move these robots efficiently from one position to another.
Submitted 27 August, 2014; v1 submitted 6 November, 2012;
originally announced November 2012.
-
Mimicking Networks and Succinct Representations of Terminal Cuts
Authors:
Robert Krauthgamer,
Inbal Rika
Abstract:
Given a large edge-weighted network $G$ with $k$ terminal vertices, we wish to compress it and store, using little memory, the value of the minimum cut (or equivalently, maximum flow) between every bipartition of terminals. One appealing methodology to implement a compression of $G$ is to construct a \emph{mimicking network}: a small network $G'$ with the same $k$ terminals, in which the minimum cut value between every bipartition of terminals is the same as in $G$. This notion was introduced by Hagerup, Katajainen, Nishimura, and Ragde [JCSS '98], who proved that such $G'$ of size at most $2^{2^k}$ always exists. Obviously, by having access to the smaller network $G'$, certain computations involving cuts can be carried out much more efficiently.
We provide several new bounds, which together narrow the previously known gap from doubly-exponential to only singly-exponential, both for planar and for general graphs. Our first and main result is that every $k$-terminal planar network admits a mimicking network $G'$ of size $O(k^2 2^{2k})$, which is moreover a minor of $G$. On the other hand, some planar networks $G$ require $|E(G')| \ge Ω(k^2)$. For general networks, we show that certain bipartite graphs only admit mimicking networks of size $|V(G')| \geq 2^{Ω(k)}$, and moreover, every data structure that stores the minimum cut value between all bipartitions of the terminals must use $2^{Ω(k)}$ machine words.
Submitted 26 July, 2012;
originally announced July 2012.