Showing 1–6 of 6 results for author: Barbany, O

  1. arXiv:2505.07600  [pdf, ps, other]

    cs.RO cs.CV

    Beyond Static Perception: Integrating Temporal Context into VLMs for Cloth Folding

    Authors: Oriol Barbany, Adrià Colomé, Carme Torras

    Abstract: Manipulating clothes is challenging due to their complex dynamics, high deformability, and frequent self-occlusions. Garments exhibit a nearly infinite number of configurations, making explicit state representations difficult to define. In this paper, we analyze BiFold, a model that predicts language-conditioned pick-and-place actions from visual observations, while implicitly encoding garment sta…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: Accepted at ICRA 2025 Workshop "Reflections on Representations and Manipulating Deformable Objects". Project page https://barbany.github.io/bifold/

  2. arXiv:2501.16458  [pdf, other]

    cs.RO cs.CV

    BiFold: Bimanual Cloth Folding with Language Guidance

    Authors: Oriol Barbany, Adrià Colomé, Carme Torras

    Abstract: Cloth folding is a complex task due to the inevitable self-occlusions of clothes, their complicated dynamics, and the disparate materials, geometries, and textures that garments can have. In this work, we learn folding actions conditioned on text commands. Translating high-level, abstract instructions into precise robotic actions requires sophisticated language understanding and manipulation capab…

    Submitted 27 January, 2025; originally announced January 2025.

    Comments: Accepted at ICRA 2025

  3. arXiv:2404.15790  [pdf, other]

    cs.CV

    Leveraging Large Language Models for Multimodal Search

    Authors: Oriol Barbany, Michael Huang, Xinliang Zhu, Arnab Dhua

    Abstract: Multimodal search has become increasingly important in providing users with a natural and effective way to express their search intentions. Images offer fine-grained details of the desired products, while text allows for easily incorporating search modifications. However, some existing multimodal search systems are unreliable and fail to address simple queries. The problem becomes harder with the…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Published at CVPRW 2024

  4. arXiv:2311.00668  [pdf, other]

    cs.CV

    ProcSim: Proxy-based Confidence for Robust Similarity Learning

    Authors: Oriol Barbany, Xiaofan Lin, Muhammet Bastan, Arnab Dhua

    Abstract: Deep Metric Learning (DML) methods aim at learning an embedding space in which distances are closely related to the inherent semantic similarity of the inputs. Previous studies have shown that popular benchmark datasets often contain numerous wrong labels, and DML methods are susceptible to them. Intending to study the effect of realistic noise, we create an ontology of the classes in a dataset an…

    Submitted 1 November, 2023; originally announced November 2023.

    Comments: Accepted to the algorithms track of WACV 2024

  5. arXiv:2310.09543  [pdf, other]

    cs.RO cs.CV cs.LG

    Benchmarking the Sim-to-Real Gap in Cloth Manipulation

    Authors: David Blanco-Mulero, Oriol Barbany, Gokhan Alcan, Adrià Colomé, Carme Torras, Ville Kyrki

    Abstract: Realistic physics engines play a crucial role for learning to manipulate deformable objects such as garments in simulation. By doing so, researchers can circumvent challenges such as sensing the deformation of the object in the real world. In spite of the extensive use of simulations for this task, few works have evaluated the reality gap between deformable object simulators and real-world data. We…

    Submitted 25 January, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

    Comments: Accepted to IEEE Robotics and Automation Letters (RA-L). 8 pages, 6 figures. Supplementary material available at https://sites.google.com/view/cloth-sim2real-benchmark

  6. arXiv:2212.11596  [pdf, other]

    cs.CV

    Deformable Surface Reconstruction via Riemannian Metric Preservation

    Authors: Oriol Barbany, Adrià Colomé, Carme Torras

    Abstract: Estimating the pose of an object from a monocular image is an inverse problem fundamental in computer vision. The ill-posed nature of this problem requires incorporating deformation priors to solve it. In practice, many materials do not perceptibly shrink or extend when manipulated, constituting a powerful and well-known prior. Mathematically, this translates to the preservation of the Riemannian…

    Submitted 17 March, 2023; v1 submitted 22 December, 2022; originally announced December 2022.

    Comments: This paper is under consideration at Computer Vision and Image Understanding