Showing 1–9 of 9 results for author: Mandi, Z

  1. arXiv:2505.24853  [pdf, other]

    cs.RO cs.AI cs.LG

    DexMachina: Functional Retargeting for Bimanual Dexterous Manipulation

    Authors: Zhao Mandi, Yifan Hou, Dieter Fox, Yashraj Narang, Ajay Mandlekar, Shuran Song

    Abstract: We study the problem of functional retargeting: learning dexterous manipulation policies to track object states from human hand-object demonstrations. We focus on long-horizon, bimanual tasks with articulated objects, which are challenging due to the large action space, spatiotemporal discontinuities, and the embodiment gap between human and robot hands. We propose DexMachina, a novel curriculum-based algo…

    Submitted 30 May, 2025; originally announced May 2025.

  2. arXiv:2409.00951  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Semantically Controllable Augmentations for Generalizable Robot Learning

    Authors: Zoey Chen, Zhao Mandi, Homanga Bharadhwaj, Mohit Sharma, Shuran Song, Abhishek Gupta, Vikash Kumar

    Abstract: Generalization to unseen real-world scenarios for robot manipulation requires exposure to diverse datasets during training. However, collecting large real-world datasets is intractable due to high operational costs. For robot learning to generalize despite these challenges, it is essential to leverage sources of data or priors beyond the robot's direct experience. In this work, we posit that image…

    Submitted 2 September, 2024; originally announced September 2024.

    Comments: Accepted for publication by IJRR. First 3 authors contributed equally. Last 3 authors advised equally

  3. arXiv:2406.08474  [pdf, other]

    cs.CV cs.AI cs.LG

    Real2Code: Reconstruct Articulated Objects via Code Generation

    Authors: Zhao Mandi, Yijia Weng, Dominik Bauer, Shuran Song

    Abstract: We present Real2Code, a novel approach to reconstructing articulated objects via code generation. Given visual observations of an object, we first reconstruct its part geometry using an image segmentation model and a shape completion model. We then represent the object parts with oriented bounding boxes, which are input to a fine-tuned large language model (LLM) to predict joint articulation as co…

    Submitted 13 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  4. arXiv:2312.00583  [pdf, other]

    cs.CV cs.RO

    DeformGS: Scene Flow in Highly Deformable Scenes for Deformable Object Manipulation

    Authors: Bardienus P. Duisterhof, Zhao Mandi, Yunchao Yao, Jia-Wei Liu, Jenny Seidenschwarz, Mike Zheng Shou, Deva Ramanan, Shuran Song, Stan Birchfield, Bowen Wen, Jeffrey Ichnowski

    Abstract: Teaching robots to fold, drape, or reposition deformable objects such as cloth will unlock a variety of automation applications. While remarkable progress has been made for rigid object manipulation, manipulating deformable objects poses unique challenges, including frequent occlusions, infinite-dimensional state spaces and complex dynamics. Just as object pose estimation and tracking have aided r…

    Submitted 30 August, 2024; v1 submitted 30 November, 2023; originally announced December 2023.

  5. arXiv:2307.04738  [pdf, other]

    cs.RO cs.AI cs.LG

    RoCo: Dialectic Multi-Robot Collaboration with Large Language Models

    Authors: Zhao Mandi, Shreeya Jain, Shuran Song

    Abstract: We propose a novel approach to multi-robot collaboration that harnesses the power of pre-trained large language models (LLMs) for both high-level communication and low-level path planning. Robots are equipped with LLMs to discuss and collectively reason task strategies. They then generate sub-task plans and task space waypoint paths, which are used by a multi-arm motion planner to accelerate traje…

    Submitted 10 July, 2023; originally announced July 2023.

  6. arXiv:2212.05711  [pdf, other]

    cs.RO cs.AI cs.LG

    CACTI: A Framework for Scalable Multi-Task Multi-Scene Visual Imitation Learning

    Authors: Zhao Mandi, Homanga Bharadhwaj, Vincent Moens, Shuran Song, Aravind Rajeswaran, Vikash Kumar

    Abstract: Large-scale training has propelled significant progress in various sub-fields of AI such as computer vision and natural language processing. However, building robot learning systems at a comparable scale remains challenging. To develop robots that can perform a wide range of skills and adapt to new scenarios, efficient methods for collecting vast and diverse amounts of data on physical robot syst…

    Submitted 16 February, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

  7. arXiv:2206.03271  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning

    Authors: Zhao Mandi, Pieter Abbeel, Stephen James

    Abstract: Intelligent agents should have the ability to leverage knowledge from previously learned tasks in order to learn new ones quickly and efficiently. Meta-learning approaches have emerged as a popular solution to achieve this. However, meta-reinforcement learning (meta-RL) algorithms have thus far been restricted to simple environments with narrow task distributions. Moreover, the paradigm of pretrai…

    Submitted 16 February, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

  8. arXiv:2110.13423  [pdf, other]

    cs.RO cs.AI cs.LG

    Towards More Generalizable One-shot Visual Imitation Learning

    Authors: Zhao Mandi, Fangchen Liu, Kimin Lee, Pieter Abbeel

    Abstract: A general-purpose robot should be able to master a wide range of tasks and quickly learn a novel one by leveraging past experiences. One-shot imitation learning (OSIL) approaches this goal by training an agent with (pairs of) expert demonstrations, such that at test time, it can directly execute a new task from just one demonstration. However, so far this framework has been limited to training on…

    Submitted 8 February, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

  9. arXiv:2109.07380  [pdf, other]

    cs.LG cs.RO

    DCUR: Data Curriculum for Teaching via Samples with Reinforcement Learning

    Authors: Daniel Seita, Abhinav Gopal, Zhao Mandi, John Canny

    Abstract: Deep reinforcement learning (RL) has shown great empirical successes, but suffers from brittleness and sample inefficiency. A potential remedy is to use a previously-trained policy as a source of supervision. In this work, we refer to these policies as teachers and study how to transfer their expertise to new student policies by focusing on data usage. We propose a framework, Data CUrriculum for R…

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: Supplementary material is available at https://tinyurl.com/teach-dcur