Showing 1–50 of 125 results for author: Kragic, D

  1. arXiv:2505.08644

    cs.CV cs.RO

    DLO-Splatting: Tracking Deformable Linear Objects Using 3D Gaussian Splatting

    Authors: Holly Dinkel, Marcel Büsching, Alberta Longhini, Brian Coltin, Trey Smith, Danica Kragic, Mårten Björkman, Timothy Bretl

    Abstract: This work presents DLO-Splatting, an algorithm for estimating the 3D shape of Deformable Linear Objects (DLOs) from multi-view RGB images and gripper state information through prediction-update filtering. The DLO-Splatting algorithm uses a position-based dynamics model with shape smoothness and rigidity dampening corrections to predict the object shape. Optimization with a 3D Gaussian Splatting-ba…

    Submitted 13 May, 2025; originally announced May 2025.

    Comments: 5 pages, 2 figures, presented at the 2025 5th Workshop: Reflections on Representations and Manipulating Deformable Objects at the IEEE International Conference on Robotics and Automation. RMDO workshop (https://deformable-workshop.github.io/icra2025/)

  2. arXiv:2504.10002

    cs.RO cs.LG

    FLoRA: Sample-Efficient Preference-based RL via Low-Rank Style Adaptation of Reward Functions

    Authors: Daniel Marta, Simon Holk, Miguel Vasco, Jens Lundell, Timon Homberger, Finn Busch, Olov Andersson, Danica Kragic, Iolanda Leite

    Abstract: Preference-based reinforcement learning (PbRL) is a suitable approach for style adaptation of pre-trained robotic behavior: adapting the robot's policy to follow human user preferences while still being able to perform the original task. However, collecting preferences for the adaptation process in robotics is often challenging and time-consuming. In this work we explore the adaptation of pre-trai…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: Accepted at 2025 IEEE International Conference on Robotics & Automation (ICRA). We provide videos of our results and source code at https://sites.google.com/view/preflora/

  3. arXiv:2503.22370

    cs.RO cs.LG

    Grasping a Handful: Sequential Multi-Object Dexterous Grasp Generation

    Authors: Haofei Lu, Yifei Dong, Zehang Weng, Jens Lundell, Danica Kragic

    Abstract: We introduce the sequential multi-object robotic grasp sampling algorithm SeqGrasp that can robustly synthesize stable grasps on diverse objects using the robotic hand's partial Degrees of Freedom (DoF). We use SeqGrasp to construct the large-scale Allegro Hand sequential grasping dataset SeqDataset and use it for training the diffusion-based sequential grasp generator SeqDiffuser. We experimental…

    Submitted 31 March, 2025; v1 submitted 28 March, 2025; originally announced March 2025.

    Comments: 8 pages, 7 figures

  4. arXiv:2503.14268

    cs.RO

    Pushing Everything Everywhere All At Once: Probabilistic Prehensile Pushing

    Authors: Patrizio Perugini, Jens Lundell, Katharina Friedl, Danica Kragic

    Abstract: We address prehensile pushing, the problem of manipulating a grasped object by pushing against the environment. Our solution is an efficient nonlinear trajectory optimization problem relaxed from an exact mixed integer non-linear trajectory optimization formulation. The critical insight is recasting the external pushers (environment) as a discrete probability distribution instead of binary variabl…

    Submitted 18 March, 2025; originally announced March 2025.

    Comments: This paper has been accepted for publication in the IEEE Robotics and Automation Letters (RA-L)

  5. arXiv:2503.02587

    cs.RO

    Learning Dexterous In-Hand Manipulation with Multifingered Hands via Visuomotor Diffusion

    Authors: Piotr Koczy, Michael C. Welle, Danica Kragic

    Abstract: We present a framework for learning dexterous in-hand manipulation with multifingered hands using visuomotor diffusion policies. Our system enables complex in-hand manipulation tasks, such as unscrewing a bottle lid with one hand, by leveraging a fast and responsive teleoperation setup for the four-fingered Allegro Hand. We collect high-quality expert demonstrations using an augmented reality (AR)…

    Submitted 4 March, 2025; originally announced March 2025.

  6. arXiv:2503.01729

    cs.RO

    FLAME: A Federated Learning Benchmark for Robotic Manipulation

    Authors: Santiago Bou Betran, Alberta Longhini, Miguel Vasco, Yuchong Zhang, Danica Kragic

    Abstract: Recent progress in robotic manipulation has been fueled by large-scale datasets collected across diverse environments. Training robotic manipulation policies on these datasets is traditionally performed in a centralized manner, raising concerns regarding scalability, adaptability, and data privacy. While federated learning enables decentralized, privacy-preserving training, its application to robo…

    Submitted 3 March, 2025; originally announced March 2025.

    Comments: Under Review

  7. arXiv:2502.15367

    cs.HC cs.SD eess.AS

    Advancing User-Voice Interaction: Exploring Emotion-Aware Voice Assistants Through a Role-Swapping Approach

    Authors: Yong Ma, Yuchong Zhang, Di Fu, Stephanie Zubicueta Portales, Danica Kragic, Morten Fjeld

    Abstract: As voice assistants (VAs) become increasingly integrated into daily life, the need for emotion-aware systems that can recognize and respond appropriately to user emotions has grown. While significant progress has been made in speech emotion recognition (SER) and sentiment analysis, effectively addressing user emotions, particularly negative ones, remains a challenge. This study explores human emotio…

    Submitted 21 February, 2025; originally announced February 2025.

    Comments: 19 pages, 6 figures

  8. arXiv:2502.11752

    cs.RO cs.HC

    Early Detection of Human Handover Intentions in Human-Robot Collaboration: Comparing EEG, Gaze, and Hand Motion

    Authors: Parag Khanna, Nona Rajabi, Sumeyra U. Demir Kanik, Danica Kragic, Mårten Björkman, Christian Smith

    Abstract: Human-robot collaboration (HRC) relies on accurate and timely recognition of human intentions to ensure seamless interactions. Among common HRC tasks, human-to-robot object handovers have been studied extensively for planning the robot's actions during object reception, assuming the human intention for object handover. However, distinguishing handover intentions from other actions has received lim…

    Submitted 17 February, 2025; originally announced February 2025.

    Comments: In submission at Robotics and Autonomous Systems, 2025

  9. arXiv:2502.09389

    cs.RO cs.AI

    S$^2$-Diffusion: Generalizing from Instance-level to Category-level Skills in Robot Manipulation

    Authors: Quantao Yang, Michael C. Welle, Danica Kragic, Olov Andersson

    Abstract: Recent advances in skill learning have propelled robot manipulation to new heights by enabling it to learn complex manipulation tasks from a practical number of demonstrations. However, these skills are often limited to the particular action, object, and environment instances that are shown in the training data, and have trouble transferring to other instances of the same category. In this…

    Submitted 17 February, 2025; v1 submitted 13 February, 2025; originally announced February 2025.

  10. arXiv:2502.09142

    cs.HC cs.RO

    LLM-Driven Augmented Reality Puppeteer: Controller-Free Voice-Commanded Robot Teleoperation

    Authors: Yuchong Zhang, Bastian Orthmann, Michael C. Welle, Jonne Van Haastregt, Danica Kragic

    Abstract: The integration of robotics and augmented reality (AR) presents transformative opportunities for advancing human-robot interaction (HRI) by improving usability, intuitiveness, and accessibility. This work introduces a controller-free, LLM-driven voice-commanded AR puppeteering system, enabling users to teleoperate a robot by manipulating its virtual counterpart in real time. By leveraging natural…

    Submitted 13 February, 2025; originally announced February 2025.

    Comments: Accepted as conference proceeding in International Conference on Human-Computer Interaction 2025 (HCI International 2025)

  11. arXiv:2502.04809

    cs.LG

    Humans Co-exist, So Must Embodied Artificial Agents

    Authors: Hannah Kuehn, Joseph La Delfa, Miguel Vasco, Danica Kragic, Iolanda Leite

    Abstract: Modern embodied artificial agents excel in static, predefined tasks but fall short in dynamic and long-term interactions with humans. On the other hand, humans can adapt and evolve continuously, exploiting the situated knowledge embedded in their environment and other agents, thus contributing to meaningful interactions. We introduce the concept of co-existence for embodied artificial agents and a…

    Submitted 10 February, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

  12. arXiv:2502.03081

    cs.CV cs.LG

    Human-Aligned Image Models Improve Visual Decoding from the Brain

    Authors: Nona Rajabi, Antônio H. Ribeiro, Miguel Vasco, Farzaneh Taleb, Mårten Björkman, Danica Kragic

    Abstract: Decoding visual images from brain activity has significant potential for advancing brain-computer interaction and enhancing the understanding of human perception. Recent approaches align the representation spaces of images and brain activity to enable visual decoding. In this paper, we introduce the use of human-aligned image encoders to map brain signals to images. We hypothesize that these model…

    Submitted 5 February, 2025; originally announced February 2025.

  13. arXiv:2502.02308

    cs.RO cs.LG

    Real-Time Operator Takeover for Visuomotor Diffusion Policy Training

    Authors: Nils Ingelhag, Jesper Munkeby, Michael C. Welle, Marco Moletta, Danica Kragic

    Abstract: We present a Real-Time Operator Takeover (RTOT) paradigm enabling operators to seamlessly take control of a live visuomotor diffusion policy, guiding the system back into desirable states or reinforcing specific demonstrations. We present new insights into using the Mahalanobis distance to automatically identify undesirable states. Once the operator has intervened and redirected the system, the cont…

    Submitted 13 February, 2025; v1 submitted 4 February, 2025; originally announced February 2025.

  14. arXiv:2501.01715

    cs.CV cs.RO

    Cloth-Splatting: 3D Cloth State Estimation from RGB Supervision

    Authors: Alberta Longhini, Marcel Büsching, Bardienus P. Duisterhof, Jens Lundell, Jeffrey Ichnowski, Mårten Björkman, Danica Kragic

    Abstract: We introduce Cloth-Splatting, a method for estimating 3D states of cloth from RGB images through a prediction-update framework. Cloth-Splatting leverages an action-conditioned dynamics model for predicting future states and uses 3D Gaussian Splatting to update the predicted states. Our key insight is that coupling a 3D mesh-based representation with Gaussian Splatting allows us to define a differe…

    Submitted 3 January, 2025; originally announced January 2025.

    Comments: Accepted at the 8th Conference on Robot Learning (CoRL 2024). Code and videos available at: kth-rpl.github.io/cloth-splatting

  15. arXiv:2411.04331

    cs.RO

    Raising Body Ownership in End-to-End Visuomotor Policy Learning via Robot-Centric Pooling

    Authors: Zheyu Zhuang, Ville Kyrki, Danica Kragic

    Abstract: We present Robot-centric Pooling (RcP), a novel pooling method designed to enhance end-to-end visuomotor policies by enabling differentiation between the robots and similar entities or their surroundings. Given an image-proprioception pair, RcP guides the aggregation of image features by highlighting image regions correlating with the robot's proprioceptive states, thereby extracting robot-centric…

    Submitted 6 November, 2024; originally announced November 2024.

    Comments: Accepted at IROS 2024

  16. arXiv:2411.03038

    cs.LG

    Can Transformers Smell Like Humans?

    Authors: Farzaneh Taleb, Miguel Vasco, Antônio H. Ribeiro, Mårten Björkman, Danica Kragic

    Abstract: The human brain encodes stimuli from the environment into representations that form a sensory perception of the world. Despite recent advances in understanding visual and auditory perception, olfactory perception remains an under-explored topic in the machine learning community due to the lack of large-scale datasets annotated with labels of human olfactory perception. In this work, we ask the que…

    Submitted 5 November, 2024; originally announced November 2024.

    Comments: Spotlight paper at NeurIPS 2024

  17. arXiv:2410.18868

    cs.LG

    A Riemannian Framework for Learning Reduced-order Lagrangian Dynamics

    Authors: Katharina Friedl, Noémie Jaquier, Jens Lundell, Tamim Asfour, Danica Kragic

    Abstract: By incorporating physical consistency as inductive bias, deep neural networks display increased generalization capabilities and data efficiency in learning nonlinear dynamic models. However, the complexity of these models generally increases with the system dimensionality, requiring larger datasets, more complex deep networks, and significant computational effort. We propose a novel geometric netw…

    Submitted 28 February, 2025; v1 submitted 24 October, 2024; originally announced October 2024.

    Comments: 28 pages, 16 figures. Accepted for publication in ICLR'25

  18. arXiv:2410.01476

    cs.LG stat.ML

    Reducing Variance in Meta-Learning via Laplace Approximation for Regression Tasks

    Authors: Alfredo Reichlin, Gustaf Tegnér, Miguel Vasco, Hang Yin, Mårten Björkman, Danica Kragic

    Abstract: Given a finite set of sample points, meta-learning algorithms aim to learn an optimal adaptation strategy for new, unseen tasks. Often, this data can be ambiguous as it might belong to different tasks concurrently. This is particularly the case in meta-regression tasks. In such cases, the estimated adaptation strategy is subject to high variance due to the limited amount of support data for each t…

    Submitted 23 October, 2024; v1 submitted 2 October, 2024; originally announced October 2024.

  19. arXiv:2409.20248

    cs.RO

    Feature Extractor or Decision Maker: Rethinking the Role of Visual Encoders in Visuomotor Policies

    Authors: Ruiyu Wang, Zheyu Zhuang, Shutong Jin, Nils Ingelhag, Danica Kragic, Florian T. Pokorny

    Abstract: An end-to-end (E2E) visuomotor policy is typically treated as a unified whole, but recent approaches using out-of-domain (OOD) data to pretrain the visual encoder have cleanly separated the visual encoder from the network, with the remainder referred to as the policy. We propose Visual Alignment Testing, an experimental framework designed to evaluate the validity of this functional separation. Our…

    Submitted 14 May, 2025; v1 submitted 30 September, 2024; originally announced September 2024.

  20. arXiv:2409.11150

    cs.RO

    The 1st InterAI Workshop: Interactive AI for Human-centered Robotics

    Authors: Yuchong Zhang, Elmira Yadollahi, Yong Ma, Di Fu, Iolanda Leite, Danica Kragic

    Abstract: The workshop is affiliated with the 33rd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN 2024), August 26-30, 2024, Pasadena, CA, USA. It is designed as a half-day event, extending over four hours from 9:00 to 12:30 PST. It accommodates both in-person and virtual attendees (via Zoom), ensuring a flexible participation mode. The agenda is thoughtfully crafted to…

    Submitted 11 October, 2024; v1 submitted 17 September, 2024; originally announced September 2024.

  21. arXiv:2409.10967

    cs.LG

    Relative Representations: Topological and Geometric Perspectives

    Authors: Alejandro García-Castellanos, Giovanni Luca Marchetti, Danica Kragic, Martina Scolamiero

    Abstract: Relative representations are an established approach to zero-shot model stitching, consisting of a non-trainable transformation of the latent space of a deep neural network. Based on insights of topological and geometric nature, we propose two improvements to relative representations. First, we introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotr…

    Submitted 15 April, 2025; v1 submitted 17 September, 2024; originally announced September 2024.

  22. HyperSteiner: Computing Heuristic Hyperbolic Steiner Minimal Trees

    Authors: Alejandro García-Castellanos, Aniss Aiman Medbouhi, Giovanni Luca Marchetti, Erik J. Bekkers, Danica Kragic

    Abstract: We propose HyperSteiner -- an efficient heuristic algorithm for computing Steiner minimal trees in the hyperbolic space. HyperSteiner extends the Euclidean Smith-Lee-Liebman algorithm, which is grounded in a divide-and-conquer approach involving the Delaunay triangulation. The central idea is rephrasing Steiner tree problems with three terminals as a system of equations in the Klein-Beltrami model…

    Submitted 13 January, 2025; v1 submitted 9 September, 2024; originally announced September 2024.

    Journal ref: Proceedings of the Symposium on Algorithm Engineering and Experiments (2025) 194-208

  23. arXiv:2407.11741

    cs.RO

    Puppeteer Your Robot: Augmented Reality Leader-Follower Teleoperation

    Authors: Jonne van Haastregt, Michael C. Welle, Yuchong Zhang, Danica Kragic

    Abstract: High-quality demonstrations are necessary when learning complex and challenging manipulation tasks. In this work, we introduce an approach to puppeteer a robot by controlling a virtual robot in an augmented reality setting. Our system retains the intuitiveness of a physical leader-follower setup while avoiding the need for expensive physical hardware. In additi…

    Submitted 16 July, 2024; originally announced July 2024.

  24. arXiv:2407.01361

    cs.RO

    Unfolding the Literature: A Review of Robotic Cloth Manipulation

    Authors: Alberta Longhini, Yufei Wang, Irene Garcia-Camacho, David Blanco-Mulero, Marco Moletta, Michael Welle, Guillem Alenyà, Hang Yin, Zackory Erickson, David Held, Júlia Borràs, Danica Kragic

    Abstract: The realm of textiles spans clothing, households, healthcare, sports, and industrial applications. The deformable nature of these objects poses unique challenges that prior work on rigid objects cannot fully address. The increasing interest within the community in textile perception and manipulation has led to new methods that aim to address challenges in modeling, perception, and control, resulti…

    Submitted 16 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: 30 pages, 3 figures, 2 tables. Submitted to Annual Review of Control, Robotics, and Autonomous Systems

  25. Vision Beyond Boundaries: An Initial Design Space of Domain-specific Large Vision Models in Human-robot Interaction

    Authors: Yuchong Zhang, Yong Ma, Danica Kragic

    Abstract: The emergence of large vision models (LVMs) follows in the footsteps of the recent prosperity of large language models (LLMs). However, there's a noticeable gap in structured research applying LVMs to human-robot interaction (HRI), despite extensive evidence supporting the efficacy of vision models in enhancing interactions between humans and robots. Recognizing the vast an…

    Submitted 16 September, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  26. arXiv:2404.08608

    cs.LG

    Hyperbolic Delaunay Geometric Alignment

    Authors: Aniss Aiman Medbouhi, Giovanni Luca Marchetti, Vladislav Polianskii, Alexander Kravberg, Petra Poklukar, Anastasia Varava, Danica Kragic

    Abstract: Hyperbolic machine learning is an emerging field aimed at representing data with a hierarchical structure. However, there is a lack of tools for evaluation and analysis of the resulting hyperbolic data representations. To this end, we propose Hyperbolic Delaunay Geometric Alignment (HyperDGA) -- a similarity score for comparing datasets in a hyperbolic space. The core idea is counting the edges of…

    Submitted 12 April, 2024; originally announced April 2024.

  27. arXiv:2403.18616

    cs.HC cs.RO

    Will You Participate? Exploring the Potential of Robotics Competitions on Human-centric Topics

    Authors: Yuchong Zhang, Miguel Vasco, Mårten Björkman, Danica Kragic

    Abstract: This paper presents findings from an exploratory needfinding study investigating the current research status of, and the robotics community's potential participation in, competitions on four human-centric topics: safety, privacy, explainability, and federated learning. We conducted a survey with 34 participants across three distinguished European robotics consortia, nearly 60% of whom possess…

    Submitted 27 March, 2024; originally announced March 2024.

    Journal ref: International Conference on Human-Computer Interaction (HCII) 2024

  28. arXiv:2403.16781

    cs.RO

    Visual Action Planning with Multiple Heterogeneous Agents

    Authors: Martina Lippi, Michael C. Welle, Marco Moletta, Alessandro Marino, Andrea Gasparri, Danica Kragic

    Abstract: Visual planning methods are promising to handle complex settings where extracting the system state is challenging. However, none of the existing works tackles the case of multiple heterogeneous agents which are characterized by different capabilities and/or embodiment. In this work, we propose a method to realize visual action planning in multi-agent settings by exploiting a roadmap built in a low…

    Submitted 25 March, 2024; originally announced March 2024.

  29. arXiv:2403.16764

    cs.RO

    Low-Cost Teleoperation with Haptic Feedback through Vision-based Tactile Sensors for Rigid and Soft Object Manipulation

    Authors: Martina Lippi, Michael C. Welle, Maciej K. Wozniak, Andrea Gasparri, Danica Kragic

    Abstract: Haptic feedback is essential for humans to successfully perform complex and delicate manipulation tasks. A recent rise in tactile sensors has enabled robots to leverage the sense of touch and expand their capability drastically. However, many tasks still need human intervention/guidance. For this reason, we present a teleoperation framework designed to provide haptic feedback to human operators ba…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: https://vision-tactile-manip.github.io/teleop/

  30. arXiv:2403.16730

    cs.RO

    A Robotic Skill Learning System Built Upon Diffusion Policies and Foundation Models

    Authors: Nils Ingelhag, Jesper Munkeby, Jonne van Haastregt, Anastasia Varava, Michael C. Welle, Danica Kragic

    Abstract: In this paper, we build upon two major recent developments in the field, Diffusion Policies for visuomotor manipulation and large pre-trained multimodal foundational models to obtain a robotic skill learning system. The system can obtain new skills via the behavioral cloning approach of visuomotor diffusion policies given teleoperated demonstrations. Foundational models are being used to perform s…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: https://roboskillframework.github.io

  31. arXiv:2403.09407

    cs.SD cs.AI cs.LG cs.MM eess.AS

    LM2D: Lyrics- and Music-Driven Dance Synthesis

    Authors: Wenjie Yin, Xuejiao Zhao, Yi Yu, Hang Yin, Danica Kragic, Mårten Björkman

    Abstract: Dance typically involves professional choreography with complex movements that follow a musical rhythm and can also be influenced by lyrical content. The integration of lyrics in addition to the auditory dimension enriches the foundational tone and makes motion generation more amenable to its semantic meanings. However, existing dance synthesis methods tend to model motions only conditioned on au…

    Submitted 14 March, 2024; originally announced March 2024.

  32. arXiv:2403.06210

    cs.RO

    AdaFold: Adapting Folding Trajectories of Cloths via Feedback-loop Manipulation

    Authors: Alberta Longhini, Michael C. Welle, Zackory Erickson, Danica Kragic

    Abstract: We present AdaFold, a model-based feedback-loop framework for optimizing folding trajectories. AdaFold extracts a particle-based representation of cloth from RGB-D images and feeds the representation back to a model predictive controller to replan the folding trajectory at every time step. A key component of AdaFold that enables feedback-loop manipulation is the use of semantic descriptors extracted from…

    Submitted 20 December, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

    Comments: 8 pages, 6 figures, 5 tables

  33. arXiv:2403.06186

    cs.RO cs.HC

    Mind Meets Robots: A Review of EEG-Based Brain-Robot Interaction Systems

    Authors: Yuchong Zhang, Nona Rajabi, Farzaneh Taleb, Andrii Matviienko, Yong Ma, Mårten Björkman, Danica Kragic

    Abstract: Brain-robot interaction (BRI) empowers individuals to control (semi-)automated machines through their brain activity, either passively or actively. In the past decade, BRI systems have achieved remarkable success, predominantly harnessing electroencephalogram (EEG) signals as the central component. This paper offers an up-to-date and exhaustive examination of 87 curated studies published during th…

    Submitted 25 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

  34. arXiv:2403.05177

    cs.RO

    Interactive Perception for Deformable Object Manipulation

    Authors: Zehang Weng, Peng Zhou, Hang Yin, Alexander Kravberg, Anastasiia Varava, David Navarro-Alarcon, Danica Kragic

    Abstract: Interactive perception enables robots to manipulate the environment and objects to bring them into states that benefit the perception process. Deformable objects pose challenges to this due to significant manipulation difficulty and occlusion in vision-based perception. In this work, we address such a problem with a setup involving both an active camera and an object manipulator. Our approach is b…

    Submitted 11 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: This work has been submitted to the IEEE for possible publication

  35. Standardization of Cloth Objects and its Relevance in Robotic Manipulation

    Authors: Irene Garcia-Camacho, Alberta Longhini, Michael Welle, Guillem Alenyà, Danica Kragic, Júlia Borràs

    Abstract: The field of robotics faces inherent challenges in manipulating deformable objects, particularly in understanding and standardising fabric properties like elasticity, stiffness, and friction. While the significance of these properties is evident in the realm of cloth manipulation, accurately categorising and comprehending them in real-world applications remains elusive. This study sets out to addr…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 2024 ICRA International Conference on Robotics and Automation (ICRA)

    Journal ref: 2024 ICRA International Conference on Robotics and Automation (ICRA), Yokohama, Japan, 2024, pp. 8298-8304

  36. arXiv:2402.10820

    cs.LG

    Learning Goal-Conditioned Policies from Sub-Optimal Offline Data via Metric Learning

    Authors: Alfredo Reichlin, Miguel Vasco, Hang Yin, Danica Kragic

    Abstract: We address the problem of learning optimal behavior from sub-optimal datasets for goal-conditioned offline reinforcement learning. To do so, we propose the use of metric learning to approximate the optimal value function for goal-conditioned offline RL problems under sparse rewards, invertible actions and deterministic transitions. We introduce distance monotonicity, a property for representations…

    Submitted 8 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  37. arXiv:2402.06665

    cs.AI cs.CL cs.LG cs.RO

    The Essential Role of Causality in Foundation World Models for Embodied AI

    Authors: Tarun Gupta, Wenbo Gong, Chao Ma, Nick Pawlowski, Agrin Hilmkil, Meyer Scetbon, Marc Rigter, Ade Famoti, Ashley Juan Llorens, Jianfeng Gao, Stefan Bauer, Danica Kragic, Bernhard Schölkopf, Cheng Zhang

    Abstract: Recent advances in foundation models, especially in large multi-modal models and conversational agents, have ignited interest in the potential of generally capable embodied agents. Such agents will require the ability to perform new tasks in many different real-world environments. However, current foundation models fail to accurately model physical interactions and are therefore insufficient for E…

    Submitted 29 April, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  38. arXiv:2402.02989

    cs.RO cs.LG

    DexDiffuser: Generating Dexterous Grasps with Diffusion Models

    Authors: Zehang Weng, Haofei Lu, Danica Kragic, Jens Lundell

    Abstract: We introduce DexDiffuser, a novel dexterous grasping method that generates, evaluates, and refines grasps on partial object point clouds. DexDiffuser includes the conditional diffusion-based grasp sampler DexSampler and the dexterous grasp evaluator DexEvaluator. DexSampler generates high-quality grasps conditioned on object point clouds by iterative denoising of randomly sampled grasps. We also i…

    Submitted 6 November, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 7 pages

  39. arXiv:2312.08550

    cs.LG cs.AI eess.SP

    Harmonics of Learning: Universal Fourier Features Emerge in Invariant Networks

    Authors: Giovanni Luca Marchetti, Christopher Hillar, Danica Kragic, Sophia Sanborn

    Abstract: In this work, we formally prove that, under certain conditions, if a neural network is invariant to a finite group then its weights recover the Fourier transform on that group. This provides a mathematical explanation for the emergence of Fourier features -- a ubiquitous phenomenon in both biological and artificial learning systems. The results hold even for non-commutative groups, in which case t…

    Submitted 14 June, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

    Comments: Accepted at the Conference on Learning Theory (COLT) 2024

  40. arXiv:2312.07311

    cs.CV cs.AI cs.LG

    Scalable Motion Style Transfer with Constrained Diffusion Generation

    Authors: Wenjie Yin, Yi Yu, Hang Yin, Danica Kragic, Mårten Björkman

    Abstract: Current training of motion style transfer systems relies on consistency losses across style domains to preserve contents, hindering its scalable application to a large number of domains and private data. Recent image transfer works show the potential of independent training on each domain by leveraging implicit bridging between diffusion models, with the content preservation, however, limited to s…

    Submitted 12 December, 2023; originally announced December 2023.

  41. arXiv:2311.18044

    cs.RO cs.LG

    Transfer Learning in Robotics: An Upcoming Breakthrough? A Review of Promises and Challenges

    Authors: Noémie Jaquier, Michael C. Welle, Andrej Gams, Kunpeng Yao, Bernardo Fichera, Aude Billard, Aleš Ude, Tamim Asfour, Danica Kragic

    Abstract: Transfer learning is a conceptually-enticing paradigm in pursuit of truly intelligent embodied agents. The core concept -- reusing prior knowledge to learn in and from novel situations -- is successfully leveraged by humans to handle novel situations. In recent years, transfer learning has received renewed interest from the community from different perspectives, including imitation learning, domai…

    Submitted 2 May, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: 21 pages, 7 figures

  42. arXiv:2310.12113  [pdf, other]

    cs.RO

    CAPGrasp: An $\mathbb{R}^3\times \text{SO(2)-equivariant}$ Continuous Approach-Constrained Generative Grasp Sampler

    Authors: Zehang Weng, Haofei Lu, Jens Lundell, Danica Kragic

    Abstract: We propose CAPGrasp, an $\mathbb{R}^3\times \text{SO(2)-equivariant}$ 6-DoF continuous approach-constrained generative grasp sampler. It includes a novel learning strategy for training CAPGrasp that eliminates the need to curate massive conditionally labeled datasets and a constrained grasp refinement technique that improves grasp poses while respecting the grasp approach directional constraints.…

    Submitted 7 March, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: This work has been submitted to the IEEE for possible publication

  43. arXiv:2310.00455  [pdf, other]

    cs.MM cs.GR cs.LG cs.SD eess.AS

    Music- and Lyrics-driven Dance Synthesis

    Authors: Wenjie Yin, Qingyuan Yao, Yi Yu, Hang Yin, Danica Kragic, Mårten Björkman

    Abstract: Lyrics often convey information about the songs that are beyond the auditory dimension, enriching the semantic meaning of movements and musical themes. Such insights are important in the dance choreography domain. However, most existing dance synthesis methods mainly focus on music-to-dance generation, without considering the semantic information. To complement it, we introduce JustLMD, a new mult…

    Submitted 30 September, 2023; originally announced October 2023.

  44. arXiv:2309.05346  [pdf, other]

    cs.LG cs.CV

    Learning Geometric Representations of Objects via Interaction

    Authors: Alfredo Reichlin, Giovanni Luca Marchetti, Hang Yin, Anastasiia Varava, Danica Kragic

    Abstract: We address the problem of learning representations from observations of a scene involving an agent and an external object the agent interacts with. To this end, we propose a representation learning framework extracting the location in physical space of both the agent and the object from unstructured observations of arbitrary nature. Our framework relies on the actions performed by the agent as the…

    Submitted 11 September, 2023; originally announced September 2023.

  45. arXiv:2306.05791  [pdf, other]

    cs.RO

    Enabling Robot Manipulation of Soft and Rigid Objects with Vision-based Tactile Sensors

    Authors: Michael C. Welle, Martina Lippi, Haofei Lu, Jens Lundell, Andrea Gasparri, Danica Kragic

    Abstract: Endowing robots with tactile capabilities opens up new possibilities for their interaction with the environment, including the ability to handle fragile and/or soft objects. In this work, we equip the robot gripper with low-cost vision-based tactile sensors and propose a manipulation algorithm that adapts to both rigid and soft objects without requiring any knowledge of their properties. The algor…

    Submitted 9 June, 2023; originally announced June 2023.

    Comments: Published in IEEE International Conference on Automation Science and Engineering (CASE2023)

  46. arXiv:2305.18120  [pdf, other]

    cs.CV

    TD-GEM: Text-Driven Garment Editing Mapper

    Authors: Reza Dadfar, Sanaz Sabzevari, Mårten Björkman, Danica Kragic

    Abstract: Language-based fashion image editing allows users to try out variations of desired garments through provided text prompts. Inspired by research on manipulating latent representations in StyleCLIP and HairCLIP, we focus on these latent spaces for editing fashion items of full-body human datasets. Currently, there is a gap in handling fashion image editing due to the complexity of garment shapes and…

    Submitted 26 July, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: The first two authors contributed equally

  47. arXiv:2305.07493  [pdf, other]

    cs.RO

    A Virtual Reality Framework for Human-Robot Collaboration in Cloth Folding

    Authors: Marco Moletta, Maciej K. Wozniak, Michael C. Welle, Danica Kragic

    Abstract: We present a virtual reality (VR) framework to automate the data collection process in cloth folding tasks. The framework uses skeleton representations to help the user define the folding plans for different classes of garments, allowing for replicating the folding on unseen items of the same class. We evaluate the framework in the context of automating garment folding tasks. A quantitative analys…

    Submitted 14 December, 2023; v1 submitted 12 May, 2023; originally announced May 2023.

  48. arXiv:2304.04681  [pdf, other]

    cs.CV cs.LG

    Controllable Motion Synthesis and Reconstruction with Autoregressive Diffusion Models

    Authors: Wenjie Yin, Ruibo Tu, Hang Yin, Danica Kragic, Hedvig Kjellström, Mårten Björkman

    Abstract: Data-driven and controllable human motion synthesis and prediction are active research areas with various applications in interactive media and social robotics. Challenges remain in these fields for generating diverse motions given past observations and dealing with imperfect poses. This paper introduces MoDiff, an autoregressive probabilistic diffusion model over motion sequences conditioned on c…

    Submitted 3 April, 2023; originally announced April 2023.

  49. arXiv:2303.15115  [pdf, other]

    cs.RO

    Ensemble Latent Space Roadmap for Improved Robustness in Visual Action Planning

    Authors: Martina Lippi, Michael C. Welle, Andrea Gasparri, Danica Kragic

    Abstract: Planning in learned latent spaces helps to decrease the dimensionality of raw observations. In this work, we propose to leverage the ensemble paradigm to enhance the robustness of latent planning systems. We rely on our Latent Space Roadmap (LSR) framework, which builds a graph in a learned structured latent space to perform planning. Given multiple LSR framework instances, that differ either on t…

    Submitted 27 March, 2023; originally announced March 2023.

  50. arXiv:2303.07972  [pdf, other]

    cs.RO

    GoNet: An Approach-Constrained Generative Grasp Sampling Network

    Authors: Zehang Weng, Haofei Lu, Jens Lundell, Danica Kragic

    Abstract: This work addresses the problem of learning approach-constrained data-driven grasp samplers. To this end, we propose GoNet: a generative grasp sampler that can constrain the grasp approach direction to a subset of SO(3). The key insight is to discretize SO(3) into a predefined number of bins and train GoNet to generate grasps whose approach directions are within those bins. At run-time, the bin al…

    Submitted 25 October, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: IEEE-RAS International Conference on Humanoid Robots (Humanoids 2023)