
Showing 1–4 of 4 results for author: Kapelyukh, I

  1. arXiv:2312.04533  [pdf, other]

    cs.RO cs.CV cs.LG

    Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models

    Authors: Ivan Kapelyukh, Yifei Ren, Ignacio Alzugaray, Edward Johns

    Abstract: We introduce Dream2Real, a robotics framework which integrates vision-language models (VLMs) trained on 2D data into a 3D object rearrangement pipeline. This is achieved by the robot autonomously constructing a 3D representation of the scene, where objects can be rearranged virtually and an image of the resulting arrangement rendered. These renders are evaluated by a VLM, so that the arrangement w…

    Submitted 29 July, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: ICRA 2024. Project webpage with robot videos: https://www.robot-learning.uk/dream2real

  2. arXiv:2311.08530  [pdf, other]

    cs.RO cs.CV cs.LG

    SceneScore: Learning a Cost Function for Object Arrangement

    Authors: Ivan Kapelyukh, Edward Johns

    Abstract: Arranging objects correctly is a key capability for robots which unlocks a wide range of useful tasks. A prerequisite for creating successful arrangements is the ability to evaluate the desirability of a given arrangement. Our method "SceneScore" learns a cost function for arrangements, such that desirable, human-like arrangements have a low cost. We learn the distribution of training arrangements…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: Presented at CoRL 2023 LEAP Workshop. Webpage: https://sites.google.com/view/scenescore

  3. arXiv:2210.02438  [pdf, other]

    cs.RO cs.CV cs.LG

    DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics

    Authors: Ivan Kapelyukh, Vitalis Vosylius, Edward Johns

    Abstract: We introduce the first work to explore web-scale diffusion models for robotics. DALL-E-Bot enables a robot to rearrange objects in a scene, by first inferring a text description of those objects, then generating an image representing a natural, human-like arrangement of those objects, and finally physically arranging the objects according to that goal image. We show that this is possible zero-shot…

    Submitted 4 May, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: Published in IEEE Robotics and Automation Letters (RA-L). Webpage and videos: https://www.robot-learning.uk/dall-e-bot

  4. arXiv:2111.03112  [pdf, other]

    cs.RO cs.LG

    My House, My Rules: Learning Tidying Preferences with Graph Neural Networks

    Authors: Ivan Kapelyukh, Edward Johns

    Abstract: Robots that arrange household objects should do so according to the user's preferences, which are inherently subjective and difficult to model. We present NeatNet: a novel Variational Autoencoder architecture using Graph Neural Network layers, which can extract a low-dimensional latent preference vector from a user by observing how they arrange scenes. Given any set of objects, this vector can the…

    Submitted 4 November, 2021; originally announced November 2021.

    Comments: Published at CoRL 2021. Webpage and video: https://www.robot-learning.uk/my-house-my-rules