Showing 1–19 of 19 results for author: Biza, O

  1. arXiv:2410.19989 [pdf, other]

    cs.RO cs.LG

    On-Robot Reinforcement Learning with Goal-Contrastive Rewards

    Authors: Ondrej Biza, Thomas Weng, Lingfeng Sun, Karl Schmeckpeper, Tarik Kelestemur, Yecheng Jason Ma, Robert Platt, Jan-Willem van de Meent, Lawson L. S. Wong

    Abstract: Reinforcement Learning (RL) has the potential to enable robots to learn from their own actions in the real world. Unfortunately, RL can be prohibitively expensive, in terms of on-robot runtime, due to inefficient exploration when learning from a sparse reward signal. Designing dense reward functions is labour-intensive and requires domain expertise. In our work, we propose GCR (Goal-Contrastive Re…

    Submitted 14 May, 2025; v1 submitted 25 October, 2024; originally announced October 2024.

  2. arXiv:2407.11298 [pdf, other]

    cs.RO

    ThinkGrasp: A Vision-Language System for Strategic Part Grasping in Clutter

    Authors: Yaoyao Qian, Xupeng Zhu, Ondrej Biza, Shuo Jiang, Linfeng Zhao, Haojie Huang, Yu Qi, Robert Platt

    Abstract: Robotic grasping in cluttered environments remains a significant challenge due to occlusions and complex object arrangements. We have developed ThinkGrasp, a plug-and-play vision-language grasping system that makes use of GPT-4o's advanced contextual reasoning for heavy clutter environment grasping strategies. ThinkGrasp can effectively identify and generate grasp poses for target objects, even wh…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Project website: https://h-freax.github.io/thinkgrasp_page/

  3. arXiv:2406.13961 [pdf, other]

    cs.LG cs.RO

    Equivariant Offline Reinforcement Learning

    Authors: Arsh Tangri, Ondrej Biza, Dian Wang, David Klee, Owen Howell, Robert Platt

    Abstract: Sample efficiency is critical when applying learning-based methods to robotic manipulation due to the high cost of collecting expert demonstrations and the challenges of on-robot policy learning through online Reinforcement Learning (RL). Offline RL addresses this issue by enabling policy learning from an offline dataset collected using any behavioral policy, regardless of its quality. However, re…

    Submitted 19 June, 2024; originally announced June 2024.

  4. arXiv:2406.11740 [pdf, other]

    cs.RO cs.AI cs.LG

    Imagination Policy: Using Generative Point Cloud Models for Learning Manipulation Policies

    Authors: Haojie Huang, Karl Schmeckpeper, Dian Wang, Ondrej Biza, Yaoyao Qian, Haotian Liu, Mingxi Jia, Robert Platt, Robin Walters

    Abstract: Humans can imagine goal states during planning and perform actions to match those goals. In this work, we propose Imagination Policy, a novel multi-task key-frame policy network for solving high-precision pick and place tasks. Instead of learning actions directly, Imagination Policy generates point clouds to imagine desired states which are then translated to actions using rigid action estimation.…

    Submitted 30 November, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  5. arXiv:2307.03704 [pdf, other]

    cs.CV cs.LG math.GR

    Equivariant Single View Pose Prediction Via Induced and Restricted Representations

    Authors: Owen Howell, David Klee, Ondrej Biza, Linfeng Zhao, Robin Walters

    Abstract: Learning about the three-dimensional world from two-dimensional images is a fundamental problem in computer vision. An ideal neural network architecture for such tasks would leverage the fact that objects can be rotated and translated in three dimensions to make predictions about novel images. However, imposing SO(3)-equivariance on two-dimensional inputs is difficult because the group of three-di…

    Submitted 7 July, 2023; originally announced July 2023.

  6. arXiv:2306.12392 [pdf, other]

    cs.RO cs.LG

    One-shot Imitation Learning via Interaction Warping

    Authors: Ondrej Biza, Skye Thompson, Kishore Reddy Pagidi, Abhinav Kumar, Elise van der Pol, Robin Walters, Thomas Kipf, Jan-Willem van de Meent, Lawson L. S. Wong, Robert Platt

    Abstract: Imitation learning of robot policies from few demonstrations is crucial in open-ended applications. We propose a new method, Interaction Warping, for learning SE(3) robotic manipulation policies from a single demonstration. We infer the 3D mesh of each object in the environment using shape warping, a technique for aligning point clouds across object instances. Then, we represent manipulation actio…

    Submitted 4 November, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: CoRL 2023

  7. arXiv:2306.06489 [pdf, other]

    cs.RO cs.AI

    On Robot Grasp Learning Using Equivariant Models

    Authors: Xupeng Zhu, Dian Wang, Guanang Su, Ondrej Biza, Robin Walters, Robert Platt

    Abstract: Real-world grasp detection is challenging due to the stochasticity in grasp dynamics and the noise in hardware. Ideally, the system would adapt to the real world by training directly on physical systems. However, this is generally difficult due to the large amount of training data required by most grasp learning models. In this paper, we note that the planar grasp function is $\mathrm{SE}(2)$-equivariant…

    Submitted 10 June, 2023; originally announced June 2023.

    Comments: Accepted in Autonomous Robots. arXiv admin note: substantial text overlap with arXiv:2202.09468

  8. arXiv:2302.13926 [pdf, other]

    cs.CV

    Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction

    Authors: David M. Klee, Ondrej Biza, Robert Platt, Robin Walters

    Abstract: Predicting the pose of objects from a single image is an important but difficult computer vision problem. Methods that predict a single point estimate do not predict the pose of objects with symmetries well and cannot represent uncertainty. Alternatively, some works predict a distribution over orientations in $\mathrm{SO}(3)$. However, training such models can be computation- and sample-inefficien…

    Submitted 27 February, 2023; originally announced February 2023.

  9. arXiv:2302.04973 [pdf, other]

    cs.CV cs.AI cs.LG

    Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames

    Authors: Ondrej Biza, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Gamaleldin F. Elsayed, Aravindh Mahendran, Thomas Kipf

    Abstract: Automatically discovering composable abstractions from raw perceptual data is a long-standing challenge in machine learning. Recent slot-based neural networks that learn about objects in a self-supervised manner have made exciting progress in this direction. However, they typically fall short at adequately capturing spatial symmetries present in the visual world, which leads to sample inefficiency…

    Submitted 20 July, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: Accepted at ICML 2023. Project page: https://invariantsa.github.io/

  10. arXiv:2207.11313 [pdf, other]

    cs.RO

    Graph-Structured Policy Learning for Multi-Goal Manipulation Tasks

    Authors: David Klee, Ondrej Biza, Robert Platt

    Abstract: Multi-goal policy learning for robotic manipulation is challenging. Prior successes have used state-based representations of the objects or provided demonstration data to facilitate learning. In this paper, by hand-coding a high-level discrete representation of the domain, we show that policies to reach dozens of goals can be learned with a single network using Q-learning from pixels. The agent fo…

    Submitted 22 July, 2022; originally announced July 2022.

  11. arXiv:2207.08925 [pdf, other]

    cs.CV cs.LG

    Image to Icosahedral Projection for $\mathrm{SO}(3)$ Object Reasoning from Single-View Images

    Authors: David Klee, Ondrej Biza, Robert Platt, Robin Walters

    Abstract: Reasoning about 3D objects based on 2D images is challenging due to variations in appearance caused by viewing the object from different orientations. Tasks such as object classification are invariant to 3D rotations and others such as pose estimation are equivariant. However, imposing equivariance as a model constraint is typically not possible with 2D image input because we do not have an a prior…

    Submitted 15 November, 2022; v1 submitted 18 July, 2022; originally announced July 2022.

  12. arXiv:2204.13022 [pdf, other]

    cs.LG

    Binding Actions to Objects in World Models

    Authors: Ondrej Biza, Robert Platt, Jan-Willem van de Meent, Lawson L. S. Wong, Thomas Kipf

    Abstract: We study the problem of binding actions to objects in object-factored world models using action-attention mechanisms. We propose two attention mechanisms for binding actions to objects, soft attention and hard attention, which we evaluate in the context of structured world models for five environments. Our experiments show that hard attention helps contrastively-trained structured world models to…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: Published at the ICLR 2022 workshop on Objects, Structure and Causality

  13. arXiv:2204.11371 [pdf, other]

    cs.LG

    Learning Symmetric Embeddings for Equivariant World Models

    Authors: Jung Yeon Park, Ondrej Biza, Linfeng Zhao, Jan Willem van de Meent, Robin Walters

    Abstract: Incorporating symmetries can lead to highly data-efficient and generalizable models by defining equivalence classes of data samples related by transformations. However, characterizing how transformations act on input data is often difficult, limiting the applicability of equivariant models. We propose learning symmetric embedding networks (SENs) that encode an input space (e.g. images), where we d…

    Submitted 30 June, 2022; v1 submitted 24 April, 2022; originally announced April 2022.

    Comments: ICML 2022

  14. arXiv:2202.09468 [pdf, other]

    cs.RO

    Sample Efficient Grasp Learning Using Equivariant Models

    Authors: Xupeng Zhu, Dian Wang, Ondrej Biza, Guanang Su, Robin Walters, Robert Platt

    Abstract: In planar grasp detection, the goal is to learn a function from an image of a scene onto a set of feasible grasp poses in $\mathrm{SE}(2)$. In this paper, we recognize that the optimal grasp function is $\mathrm{SE}(2)$-equivariant and can be modeled using an equivariant convolutional neural network. As a result, we are able to significantly improve the sample efficiency of grasp learning, obtaini…

    Submitted 18 February, 2022; originally announced February 2022.

  15. arXiv:2202.05333 [pdf, other]

    cs.RO cs.LG

    Factored World Models for Zero-Shot Generalization in Robotic Manipulation

    Authors: Ondrej Biza, Thomas Kipf, David Klee, Robert Platt, Jan-Willem van de Meent, Lawson L. S. Wong

    Abstract: World models for environments with many objects face a combinatorial explosion of states: as the number of objects increases, the number of possible arrangements grows exponentially. In this paper, we learn to generalize over robotic pick-and-place tasks using object-factored world models, which combat the combinatorial explosion by ensuring that predictions are equivariant to permutations of obje…

    Submitted 10 February, 2022; originally announced February 2022.

  16. arXiv:2107.11676 [pdf, other]

    cs.LG

    The Impact of Negative Sampling on Contrastive Structured World Models

    Authors: Ondrej Biza, Elise van der Pol, Thomas Kipf

    Abstract: World models trained by contrastive learning are a compelling alternative to autoencoder-based world models, which learn by reconstructing pixel states. In this paper, we describe three cases where small changes in how we sample negative states in the contrastive loss lead to drastic changes in model performance. In previously studied Atari datasets, we show that leveraging time step correlations…

    Submitted 24 July, 2021; originally announced July 2021.

    Comments: This work appeared at the ICML 2021 Workshop: Self-Supervised Learning for Reasoning and Perception

  17. arXiv:2101.04178 [pdf, other]

    cs.RO cs.LG

    Action Priors for Large Action Spaces in Robotics

    Authors: Ondrej Biza, Dian Wang, Robert Platt, Jan-Willem van de Meent, Lawson L. S. Wong

    Abstract: In robotics, it is often not possible to learn useful policies using pure model-free reinforcement learning without significant reward shaping or curriculum learning. As a consequence, many researchers rely on expert demonstrations to guide learning. However, acquiring expert demonstrations can be expensive. This paper proposes an alternative approach where the solutions of previously solved tasks…

    Submitted 15 February, 2021; v1 submitted 11 January, 2021; originally announced January 2021.

    Comments: 13 pages, 9 figures

    Journal ref: Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS '21). 2021. 205 - 213

  18. arXiv:2003.04300 [pdf, other]

    cs.LG stat.ML

    Learning Discrete State Abstractions With Deep Variational Inference

    Authors: Ondrej Biza, Robert Platt, Jan-Willem van de Meent, Lawson L. S. Wong

    Abstract: Abstraction is crucial for effective sequential decision making in domains with large state spaces. In this work, we propose an information bottleneck method for learning approximate bisimulations, a type of state abstraction. We use a deep neural encoder to map states onto continuous embeddings. We map these embeddings onto a discrete representation using an action-conditioned hidden Markov model…

    Submitted 11 January, 2021; v1 submitted 9 March, 2020; originally announced March 2020.

    Comments: 15 pages, 7 figures

  19. arXiv:1811.12929 [pdf, other]

    cs.LG stat.ML

    Online Abstraction with MDP Homomorphisms for Deep Learning

    Authors: Ondrej Biza, Robert Platt

    Abstract: Abstraction of Markov Decision Processes is a useful tool for solving complex problems, as it can ignore unimportant aspects of an environment, simplifying the process of learning an optimal policy. In this paper, we propose a new algorithm for finding abstract MDPs in environments with continuous state spaces. It is based on MDP homomorphisms, a structure-preserving mapping between MDPs. We demon…

    Submitted 3 April, 2019; v1 submitted 30 November, 2018; originally announced November 2018.

    Journal ref: Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS '19). 2019. 1125 - 1133