Showing 1–50 of 58 results for author: Johns, E

  1. arXiv:2503.06831  [pdf, other]

    cs.RO cs.CV

    One-Shot Dual-Arm Imitation Learning

    Authors: Yilong Wang, Edward Johns

    Abstract: We introduce One-Shot Dual-Arm Imitation Learning (ODIL), which enables dual-arm robots to learn precise and coordinated everyday tasks from just a single demonstration of the task. ODIL uses a new three-stage visual servoing (3-VS) method for precise alignment between the end-effector and target object, after which replay of the demonstration trajectory is sufficient to perform the task. This is…

    Submitted 9 March, 2025; originally announced March 2025.

    Comments: Accepted at ICRA 2025. Project Webpage: https://www.robot-learning.uk/one-shot-dual-arm

  2. arXiv:2411.12633  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Instant Policy: In-Context Imitation Learning via Graph Diffusion

    Authors: Vitalis Vosylius, Edward Johns

    Abstract: Following the impressive capabilities of in-context learning with large transformers, In-Context Imitation Learning (ICIL) is a promising opportunity for robotics. We introduce Instant Policy, which learns new tasks instantly (without further training) from just one or two demonstrations, achieving ICIL through two key components. First, we introduce inductive biases through a graph representation…

    Submitted 25 April, 2025; v1 submitted 19 November, 2024; originally announced November 2024.

    Comments: Code and videos are available on our project webpage at https://www.robot-learning.uk/instant-policy

  3. arXiv:2410.19693  [pdf, other]

    cs.RO cs.AI cs.LG

    MILES: Making Imitation Learning Easy with Self-Supervision

    Authors: Georgios Papagiannis, Edward Johns

    Abstract: Data collection in imitation learning often requires significant, laborious human supervision, such as numerous demonstrations, and/or frequent environment resets for methods that incorporate reinforcement learning. In this work, we propose an alternative approach, MILES: a fully autonomous, self-supervised data collection paradigm, and we show that this enables efficient policy learning from just…

    Submitted 25 October, 2024; originally announced October 2024.

    Comments: Published at the Conference on Robot Learning (CoRL) 2024

  4. arXiv:2408.00178  [pdf, other]

    cs.RO cs.LG

    Adapting Skills to Novel Grasps: A Self-Supervised Approach

    Authors: Georgios Papagiannis, Kamil Dreczkowski, Vitalis Vosylius, Edward Johns

    Abstract: In this paper, we study the problem of adapting manipulation trajectories involving grasped objects (e.g. tools) defined for a single grasp pose to novel grasp poses. A common approach to address this is to define a new trajectory for each possible grasp explicitly, but this is highly inefficient. Instead, we propose a method to adapt such trajectories directly while only requiring a period of sel…

    Submitted 31 July, 2024; originally announced August 2024.

    Comments: Accepted at IROS 2024

  5. arXiv:2407.12957  [pdf, other]

    cs.RO cs.LG

    R+X: Retrieval and Execution from Everyday Human Videos

    Authors: Georgios Papagiannis, Norman Di Palo, Pietro Vitiello, Edward Johns

    Abstract: We present R+X, a framework which enables robots to learn skills from long, unlabelled, first-person videos of humans performing everyday tasks. Given a language command from a human, R+X first retrieves short video clips containing relevant behaviour, and then executes the skill by conditioning an in-context imitation learning method (KAT) on this behaviour. By leveraging a Vision Language Model…

    Submitted 3 April, 2025; v1 submitted 17 July, 2024; originally announced July 2024.

    Comments: Published at the IEEE International Conference on Robotics and Automation (ICRA) 2025

  6. arXiv:2404.00117  [pdf, other]

    physics.bio-ph cond-mat.soft

    Spectral approaches to stress relaxation in epithelial monolayers

    Authors: Natasha Cowley, Christopher K. Revell, Emma Johns, Sarah Woolner, Oliver E. Jensen

    Abstract: We investigate the viscoelastic relaxation to equilibrium of a disordered planar epithelium described using the cell vertex model. In its standard form, the model is formulated as coupled evolution equations for the locations of vertices of confluent polygonal cells. Exploiting the model's gradient-flow structure, we use singular-value decomposition to project modes of deformation of vertices onto…

    Submitted 29 March, 2024; originally announced April 2024.

    Comments: 8 figures

  7. arXiv:2403.19578  [pdf, other]

    cs.RO cs.LG cs.NE

    Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics

    Authors: Norman Di Palo, Edward Johns

    Abstract: We show that off-the-shelf text-based Transformers, with no additional training, can perform few-shot in-context visual imitation learning, mapping visual observations to action sequences that emulate the demonstrator's behaviour. We achieve this by transforming visual observations (inputs) and trajectories of actions (outputs) into sequences of tokens that a text-pretrained Transformer (GPT-4 Tur…

    Submitted 17 October, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Published at Robotics: Science and Systems (RSS) 2024

  8. arXiv:2402.13181  [pdf, other]

    cs.RO cs.LG

    DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models

    Authors: Norman Di Palo, Edward Johns

    Abstract: We propose DINOBot, a novel imitation learning framework for robot manipulation, which leverages the image-level and pixel-level capabilities of features extracted from Vision Transformers trained with DINO. When interacting with a novel object, DINOBot first uses these features to retrieve the most visually similar object experienced during human demonstrations, and then uses this object to align…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: To appear at 2024 IEEE International Conference on Robotics and Automation (ICRA)

  9. arXiv:2312.12345  [pdf, other]

    cs.RO cs.LG

    On the Effectiveness of Retrieval, Alignment, and Replay in Manipulation

    Authors: Norman Di Palo, Edward Johns

    Abstract: Imitation learning with visual observations is notoriously inefficient when addressed with end-to-end behavioural cloning methods. In this paper, we explore an alternative paradigm which decomposes reasoning into three phases. First, a retrieval phase, which informs the robot what it can do with an object. Second, an alignment phase, which informs the robot where to interact with the object. And t…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Published in IEEE Robotics and Automation Letters (RA-L). (Accepted December 2023)

  10. arXiv:2312.10807  [pdf, other]

    cs.RO

    Bridging Language and Action: A Survey of Language-Conditioned Robot Manipulation

    Authors: Hongkuan Zhou, Xiangtong Yao, Oier Mees, Yuan Meng, Ted Xiao, Yonatan Bisk, Jean Oh, Edward Johns, Mohit Shridhar, Dhruv Shah, Jesse Thomason, Kai Huang, Joyce Chai, Zhenshan Bing, Alois Knoll

    Abstract: Language-conditioned robot manipulation is an emerging field aimed at enabling seamless communication and cooperation between humans and robotic agents by teaching robots to comprehend and execute instructions conveyed in natural language. This interdisciplinary area integrates scene understanding, language processing, and policy learning to bridge the gap between human instructions and robotic ac…

    Submitted 17 February, 2025; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: 37 pages, 15 figures, 4 tables, 354 citations

  11. arXiv:2312.04533  [pdf, other]

    cs.RO cs.CV cs.LG

    Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models

    Authors: Ivan Kapelyukh, Yifei Ren, Ignacio Alzugaray, Edward Johns

    Abstract: We introduce Dream2Real, a robotics framework which integrates vision-language models (VLMs) trained on 2D data into a 3D object rearrangement pipeline. This is achieved by the robot autonomously constructing a 3D representation of the scene, where objects can be rearranged virtually and an image of the resulting arrangement rendered. These renders are evaluated by a VLM, so that the arrangement w…

    Submitted 29 July, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: ICRA 2024. Project webpage with robot videos: https://www.robot-learning.uk/dream2real

  12. arXiv:2311.08530  [pdf, other]

    cs.RO cs.CV cs.LG

    SceneScore: Learning a Cost Function for Object Arrangement

    Authors: Ivan Kapelyukh, Edward Johns

    Abstract: Arranging objects correctly is a key capability for robots which unlocks a wide range of useful tasks. A prerequisite for creating successful arrangements is the ability to evaluate the desirability of a given arrangement. Our method "SceneScore" learns a cost function for arrangements, such that desirable, human-like arrangements have a low cost. We learn the distribution of training arrangements…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: Presented at CoRL 2023 LEAP Workshop. Webpage: https://sites.google.com/view/scenescore

  13. arXiv:2310.12238  [pdf, other]

    cs.RO cs.AI cs.LG

    Few-Shot In-Context Imitation Learning via Implicit Graph Alignment

    Authors: Vitalis Vosylius, Edward Johns

    Abstract: Consider the following problem: given a few demonstrations of a task across a few different objects, how can a robot learn to perform that same task on new, previously unseen objects? This is challenging because the large variety of objects within a class makes it difficult to infer the task-relevant relationship between the new objects and the objects in the demonstrations. We address this by for…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: Published at CoRL 2023. Videos are available on our project webpage at https://www.robot-learning.uk/implicit-graph-alignment

  14. arXiv:2310.12077  [pdf, other]

    cs.RO cs.CV cs.LG

    One-Shot Imitation Learning: A Pose Estimation Perspective

    Authors: Pietro Vitiello, Kamil Dreczkowski, Edward Johns

    Abstract: In this paper, we study imitation learning under the challenging setting of: (1) only a single demonstration, (2) no further data collection, and (3) no prior task or object knowledge. We show how, with these constraints, imitation learning can be formulated as a combination of trajectory transfer and unseen object pose estimation. To explore this idea, we provide an in-depth study on how state-of…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: Published at the 7th Conference on Robot Learning (CoRL 2023). For more details please visit https://www.robot-learning.uk/pose-estimation-perspective

  15. arXiv:2310.11604  [pdf, other]

    cs.RO cs.AI cs.CL cs.HC cs.LG

    Language Models as Zero-Shot Trajectory Generators

    Authors: Teyun Kwon, Norman Di Palo, Edward Johns

    Abstract: Large Language Models (LLMs) have recently shown promise as high-level planners for robots when given access to a selection of low-level skills. However, it is often assumed that LLMs do not possess sufficient knowledge to be used for the low-level trajectories themselves. In this work, we address this assumption thoroughly, and investigate if an LLM (GPT-4) can directly predict a dense sequence o…

    Submitted 17 June, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Published in IEEE Robotics and Automation Letters (Volume: 9, Issue: 7, July 2024, Pages: 6728-6735); 10 pages, 12 figures

    Journal ref: IEEE Robotics and Automation Letters (Volume: 9, Issue: 7, July 2024, Pages: 6728-6735)

  16. arXiv:2310.08864  [pdf, other]

    cs.RO

    Open X-Embodiment: Robotic Learning Datasets and RT-X Models

    Authors: Open X-Embodiment Collaboration, Abby O'Neill, Abdul Rehman, Abhinav Gupta, Abhiram Maddukuri, Abhishek Gupta, Abhishek Padalkar, Abraham Lee, Acorn Pooley, Agrim Gupta, Ajay Mandlekar, Ajinkya Jain, Albert Tung, Alex Bewley, Alex Herzog, Alex Irpan, Alexander Khazatsky, Anant Rai, Anchit Gupta, Andrew Wang, Andrey Kolobov, Anikait Singh, Animesh Garg, Aniruddha Kembhavi, Annie Xie , et al. (269 additional authors not shown)

    Abstract: Large, high-capacity models trained on diverse datasets have shown remarkable successes on efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning method…

    Submitted 14 May, 2025; v1 submitted 13 October, 2023; originally announced October 2023.

    Comments: Project website: https://robotics-transformer-x.github.io

  17. arXiv:2305.09870   

    cs.RO

    Crossing the Reality Gap in Tactile-Based Learning

    Authors: Ya-Yen Tsai, Bidan Huang, Yu Zheng, Lei Han, Wang Wei Lee, Edward Johns

    Abstract: Tactile sensors are believed to be essential in robotic manipulation, and prior works often rely on experts to reason the sensor feedback and design a controller. With the recent advancement in data-driven approaches, complicated manipulation can be realised, but an accurate and efficient tactile simulation is necessary for policy training. To this end, we present an approach to model a commonly u…

    Submitted 22 May, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: This work requires further improvement

  18. arXiv:2303.02506  [pdf, other]

    cs.LG cs.AI cs.CV

    Prismer: A Vision-Language Model with Multi-Task Experts

    Authors: Shikun Liu, Linxi Fan, Edward Johns, Zhiding Yu, Chaowei Xiao, Anima Anandkumar

    Abstract: Recent vision-language models have shown impressive multi-modal generation capabilities. However, typically they require training huge models on massive datasets. As a more scalable alternative, we introduce Prismer, a data- and parameter-efficient vision-language model that leverages an ensemble of task-specific experts. Prismer only requires training of a small number of components, with the maj…

    Submitted 18 January, 2024; v1 submitted 4 March, 2023; originally announced March 2023.

    Comments: Published at TMLR 2024. Project Page: https://shikun.io/projects/prismer Code: https://github.com/NVlabs/prismer

  19. arXiv:2212.06111  [pdf, other]

    cs.RO cs.LG

    Where To Start? Transferring Simple Skills to Complex Environments

    Authors: Vitalis Vosylius, Edward Johns

    Abstract: Robot learning provides a number of ways to teach robots simple skills, such as grasping. However, these skills are usually trained in open, clutter-free environments, and therefore would likely cause undesirable collisions in more complex, cluttered environments. In this work, we introduce an affordance model based on a graph representation of an environment, which is optimised during deployment…

    Submitted 12 December, 2022; originally announced December 2022.

    Comments: Accepted at CoRL 2022. Videos are available on our project webpage at https://www.robot-learning.uk/where-to-start

  20. arXiv:2210.17325  [pdf, other]

    cs.RO cs.CV

    Real-time Mapping of Physical Scene Properties with an Autonomous Robot Experimenter

    Authors: Iain Haughton, Edgar Sucar, Andre Mouton, Edward Johns, Andrew J. Davison

    Abstract: Neural fields can be trained from scratch to represent the shape and appearance of 3D scenes efficiently. It has also been shown that they can densely map correlated properties such as semantics, via sparse interactions from a human labeller. In this work, we show that a robot can densely annotate a scene with arbitrary discrete or continuous physical properties via its own fully-autonomous experi…

    Submitted 31 October, 2022; originally announced October 2022.

  21. arXiv:2210.02438  [pdf, other]

    cs.RO cs.CV cs.LG

    DALL-E-Bot: Introducing Web-Scale Diffusion Models to Robotics

    Authors: Ivan Kapelyukh, Vitalis Vosylius, Edward Johns

    Abstract: We introduce the first work to explore web-scale diffusion models for robotics. DALL-E-Bot enables a robot to rearrange objects in a scene, by first inferring a text description of those objects, then generating an image representing a natural, human-like arrangement of those objects, and finally physically arranging the objects according to that goal image. We show that this is possible zero-shot…

    Submitted 4 May, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: Webpage and videos: ( https://www.robot-learning.uk/dall-e-bot ) Published in IEEE Robotics and Automation Letters (RA-L)

  22. Lagrangian coherence and source of water of Loop Current Frontal Eddies in the Gulf of Mexico

    Authors: Luna Hiron, Philippe Miron, Lynn K. Shay, William E. Johns, Eric P. Chassignet, Alexandra Bozec

    Abstract: Loop Current Frontal Eddies (LCFEs) are known to intensify and assist in the Loop Current (LC) eddy shedding. These eddies can also modify the circulation in the eastern Gulf of Mexico (GoM) by attracting water and passive tracers such as chlorophyll and pollutants to the LC-LCFE front. During the 2010 Deepwater Horizon oil spill, part of the oil was entrained not only in the LC-LCFE front but als…

    Submitted 6 September, 2022; originally announced September 2022.

    Journal ref: Progress in Oceanography, 102876, ISSN 0079-6611 (2022)

  23. arXiv:2208.11658  [pdf, other]

    cs.CV cs.AI

    AGO-Net: Association-Guided 3D Point Cloud Object Detection Network

    Authors: Liang Du, Xiaoqing Ye, Xiao Tan, Edward Johns, Bo Chen, Errui Ding, Xiangyang Xue, Jianfeng Feng

    Abstract: The human brain can effortlessly recognize and localize objects, whereas current 3D object detection methods based on LiDAR point clouds still report inferior performance for detecting occluded and distant objects: the point cloud appearance varies greatly due to occlusion, and has inherent variance in point densities along the distance to sensors. Therefore, designing feature representations robu…

    Submitted 24 August, 2022; originally announced August 2022.

    Comments: 12 pages

  24. arXiv:2204.02863  [pdf, other]

    cs.RO cs.AI cs.CV

    Demonstrate Once, Imitate Immediately (DOME): Learning Visual Servoing for One-Shot Imitation Learning

    Authors: Eugene Valassakis, Georgios Papagiannis, Norman Di Palo, Edward Johns

    Abstract: We present DOME, a novel method for one-shot imitation learning, where a task can be learned from just a single demonstration and then be deployed immediately, without any further data collection or training. DOME does not require prior task or object knowledge, and can perform the task in novel object configurations and with distractors. At its core, DOME uses an image-conditioned object segmenta…

    Submitted 27 July, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

    Comments: To be published at IROS 2022. 7 figures, 8 pages. Videos and supplementary material are available at: https://www.robot-learning.uk/dome

  25. arXiv:2202.03091  [pdf, other]

    cs.LG cs.AI cs.CV

    Auto-Lambda: Disentangling Dynamic Task Relationships

    Authors: Shikun Liu, Stephen James, Andrew J. Davison, Edward Johns

    Abstract: Understanding the structure of multiple related tasks allows for multi-task learning to improve the generalisation ability of one or all of them. However, it usually requires training each pairwise combination of tasks together in order to capture task relationships, at an extremely high computational cost. In this work, we learn task relationships via an automated weighting framework, named Auto-…

    Submitted 2 June, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: Published at TMLR 2022. Project Page: https://shikun.io/projects/auto-lambda Code: https://github.com/lorenmt/auto-lambda

  26. arXiv:2111.12867  [pdf, other]

    cs.RO cs.LG

    Back to Reality for Imitation Learning

    Authors: Edward Johns

    Abstract: Imitation learning, and robot learning in general, emerged due to breakthroughs in machine learning, rather than breakthroughs in robotics. As such, evaluation metrics for robot learning are deeply rooted in those for machine learning, and focus primarily on data efficiency. We believe that a better metric for real-world robot learning is time efficiency, which better models the true cost to human…

    Submitted 24 November, 2021; originally announced November 2021.

    Comments: Published at CoRL 2021, blue sky oral track

  27. arXiv:2111.07447  [pdf, other]

    cs.RO cs.LG

    Learning Multi-Stage Tasks with One Demonstration via Self-Replay

    Authors: Norman Di Palo, Edward Johns

    Abstract: In this work, we introduce a novel method to learn everyday-like multi-stage tasks from a single human demonstration, without requiring any prior object knowledge. Inspired by the recent Coarse-to-Fine Imitation Learning method, we model imitation learning as a learned object reaching phase followed by an open-loop replay of the demonstrator's actions. We build upon this for multi-stage tasks wher…

    Submitted 14 November, 2021; originally announced November 2021.

    Comments: Published at the 5th Conference on Robot Learning (CoRL) 2021

  28. arXiv:2111.03112  [pdf, other]

    cs.RO cs.LG

    My House, My Rules: Learning Tidying Preferences with Graph Neural Networks

    Authors: Ivan Kapelyukh, Edward Johns

    Abstract: Robots that arrange household objects should do so according to the user's preferences, which are inherently subjective and difficult to model. We present NeatNet: a novel Variational Autoencoder architecture using Graph Neural Network layers, which can extract a low-dimensional latent preference vector from a user by observing how they arrange scenes. Given any set of objects, this vector can the…

    Submitted 4 November, 2021; originally announced November 2021.

    Comments: Published at CoRL 2021. Webpage and video: https://www.robot-learning.uk/my-house-my-rules

  29. arXiv:2111.01245  [pdf, other]

    cs.RO cs.CV cs.LG

    Learning Eye-in-Hand Camera Calibration from a Single Image

    Authors: Eugene Valassakis, Kamil Dreczkowski, Edward Johns

    Abstract: Eye-in-hand camera calibration is a fundamental and long-studied problem in robotics. We present a study on using learning-based methods for solving this problem online from a single RGB image, whilst training our models with entirely synthetic data. We study three main approaches: one direct regression model that directly predicts the extrinsic matrix from an image, one sparse correspondence mode…

    Submitted 3 November, 2021; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: Published at the 2021 Conference on Robot Learning (CoRL). Webpage and video: https://www.robot-learning.uk/learning-eye-in-hand-calibration

  30. arXiv:2109.07559  [pdf, other]

    cs.CV cs.RO

    Hybrid ICP

    Authors: Kamil Dreczkowski, Edward Johns

    Abstract: ICP algorithms typically involve a fixed choice of data association method and a fixed choice of error metric. In this paper, we propose Hybrid ICP, a novel and flexible ICP variant which dynamically optimises both the data association method and error metric based on the live image of an object and the current ICP estimate. We show that when used for object pose estimation, Hybrid ICP is more acc…

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: Published at IROS 2021. Webpage and video: https://www.robot-learning.uk/hybrid-icp

  31. arXiv:2105.11283  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Coarse-to-Fine for Sim-to-Real: Sub-Millimetre Precision Across Wide Task Spaces

    Authors: Eugene Valassakis, Norman Di Palo, Edward Johns

    Abstract: In this paper, we study the problem of zero-shot sim-to-real when the task requires both highly precise control with sub-millimetre error tolerance, and wide task space generalisation. Our framework involves a coarse-to-fine controller, where trajectories begin with classical motion planning using ICP-based pose estimation, and transition to a learned end-to-end controller which maps images to act…

    Submitted 29 July, 2021; v1 submitted 24 May, 2021; originally announced May 2021.

    Comments: To be published at IROS 2021. 8 pages, 6 figures

  32. arXiv:2105.06411  [pdf, other]

    cs.RO cs.LG

    Coarse-to-Fine Imitation Learning: Robot Manipulation from a Single Demonstration

    Authors: Edward Johns

    Abstract: We introduce a simple new method for visual imitation learning, which allows a novel robot manipulation task to be learned from a single human demonstration, without requiring any prior knowledge of the object being interacted with. Our method models imitation learning as a state estimation problem, with the state defined as the end-effector's pose at the point where object interaction begins, as…

    Submitted 10 June, 2021; v1 submitted 13 May, 2021; originally announced May 2021.

    Comments: Published at ICRA 2021. Webpage and video: https://www.robot-learning.uk/coarse-to-fine-imitation-learning

  33. arXiv:2104.04465  [pdf, other]

    cs.CV cs.LG

    Bootstrapping Semantic Segmentation with Regional Contrast

    Authors: Shikun Liu, Shuaifeng Zhi, Edward Johns, Andrew J. Davison

    Abstract: We present ReCo, a contrastive learning framework designed at a regional level to assist learning in semantic segmentation. ReCo performs semi-supervised or supervised pixel-level contrastive learning on a sparse set of hard negative pixels, with minimal additional memory footprint. ReCo is easy to implement, being built on top of off-the-shelf segmentation networks, and consistently improves perf…

    Submitted 31 January, 2022; v1 submitted 9 April, 2021; originally announced April 2021.

    Comments: Published at ICLR 2022. Project Page: https://shikun.io/projects/regional-contrast. Code: https://github.com/lorenmt/reco

  34. arXiv:2102.11003  [pdf, other]

    cs.RO

    DROID: Minimizing the Reality Gap using Single-Shot Human Demonstration

    Authors: Ya-Yen Tsai, Hui Xu, Zihan Ding, Chong Zhang, Edward Johns, Bidan Huang

    Abstract: Reinforcement learning (RL) has demonstrated great success in the past several years. However, most of the scenarios focus on simulated environments. One of the main challenges of transferring the policy learned in a simulated environment to real world, is the discrepancy between the dynamics of the two environments. In prior works, Domain Randomization (DR) has been used to address the reality ga…

    Submitted 23 February, 2021; v1 submitted 22 February, 2021; originally announced February 2021.

    Comments: paper accepted and to be published in RA-L 2021

  35. arXiv:2011.09586  [pdf, other]

    cs.RO cs.LG

    SAFARI: Safe and Active Robot Imitation Learning with Imagination

    Authors: Norman Di Palo, Edward Johns

    Abstract: One of the main issues in Imitation Learning is the erroneous behavior of an agent when facing out-of-distribution situations, not covered by the set of demonstrations given by the expert. In this work, we tackle this problem by introducing a novel active learning and control algorithm, SAFARI. During training, it allows an agent to request further human demonstrations when these out-of-distributi…

    Submitted 18 November, 2020; originally announced November 2020.

  36. arXiv:2011.07112  [pdf, other]

    cs.RO cs.CV

    Benchmarking Domain Randomisation for Visual Sim-to-Real Transfer

    Authors: Raghad Alghonaim, Edward Johns

    Abstract: Domain randomisation is a very popular method for visual sim-to-real transfer in robotics, due to its simplicity and ability to achieve transfer without any real-world images at all. Nonetheless, a number of design choices must be made to achieve optimal transfer. In this paper, we perform a comprehensive benchmarking study on these different choices, with two key experiments evaluated on a real-w…

    Submitted 21 May, 2021; v1 submitted 13 November, 2020; originally announced November 2020.

    Comments: Published at ICRA 2021. For project page, please visit: https://www.robot-learning.uk/benchmarking-domain-randomisation

  37. arXiv:2008.06686  [pdf, other]

    cs.RO cs.LG

    Crossing The Gap: A Deep Dive into Zero-Shot Sim-to-Real Transfer for Dynamics

    Authors: Eugene Valassakis, Zihan Ding, Edward Johns

    Abstract: Zero-shot sim-to-real transfer of tasks with complex dynamics is a highly challenging and unsolved problem. A number of solutions have been proposed in recent years, but we have found that many works do not present a thorough evaluation in the real world, or underplay the significant engineering effort and task-specific fine tuning that is required to achieve the published results. In this paper,…

    Submitted 15 August, 2020; originally announced August 2020.

    Comments: To be published at IROS 2020. 8 pages, 6 figures. For supplementary material and code, please visit : https://www.robot-learning.uk/crossing-the-gap

  38. arXiv:2008.03285  [pdf, other]

    cs.CV cs.RO

    Physics-Based Dexterous Manipulations with Estimated Hand Poses and Residual Reinforcement Learning

    Authors: Guillermo Garcia-Hernando, Edward Johns, Tae-Kyun Kim

    Abstract: Dexterous manipulation of objects in virtual environments with our bare hands, by using only a depth sensor and a state-of-the-art 3D hand pose estimator (HPE), is challenging. While virtual environments are ruled by physics, e.g. object weights and surface frictions, the absence of force feedback makes the task challenging, as even slight inaccuracies on finger tips or contact points from HPE may…

    Submitted 7 August, 2020; originally announced August 2020.

    Comments: To appear in IROS2020

  39. arXiv:2008.00892  [pdf, other]

    cs.LG cs.CV stat.ML

    Shape Adaptor: A Learnable Resizing Module

    Authors: Shikun Liu, Zhe Lin, Yilin Wang, Jianming Zhang, Federico Perazzi, Edward Johns

    Abstract: We present a novel resizing module for neural networks: shape adaptor, a drop-in enhancement built on top of traditional resizing layers, such as pooling, bilinear sampling, and strided convolution. Whilst traditional resizing layers have fixed and deterministic reshaping factors, our module allows for a learnable reshaping factor. Our implementation enables shape adaptors to be trained end-to-end…

    Submitted 10 August, 2020; v1 submitted 3 August, 2020; originally announced August 2020.

    Comments: Published at ECCV 2020

  40. arXiv:2004.00716  [pdf, other]

    cs.RO cs.AI cs.LG

    Constrained-Space Optimization and Reinforcement Learning for Complex Tasks

    Authors: Ya-Yen Tsai, Bo Xiao, Edward Johns, Guang-Zhong Yang

    Abstract: Learning from Demonstration is increasingly used for transferring operator manipulation skills to robots. In practice, it is important to cater for limited data and imperfect human demonstrations, as well as underlying safety constraints. This paper presents a constrained-space optimization and reinforcement learning scheme for managing complex tasks. Through interactions within the constrained sp…

    Submitted 1 April, 2020; originally announced April 2020.

    Comments: Accepted for publication in RA-Letters and at ICRA 2020

    Journal ref: IEEE Robotics and Automation Letters, 5(2) (2020) 682-689

  41. arXiv:2004.00136  [pdf, other]

    cs.RO

    Sim-to-Real Transfer for Optical Tactile Sensing

    Authors: Zihan Ding, Nathan F. Lepora, Edward Johns

    Abstract: Deep learning and reinforcement learning methods have been shown to enable learning of flexible and complex robot controllers. However, the reliance on large amounts of training data often requires data collection to be carried out in simulation, with a number of sim-to-real transfer methods being developed in recent years. In this paper, we study these techniques for tactile sensing using the Tac…

    Submitted 31 March, 2020; originally announced April 2020.

    Comments: Accepted for publication at ICRA 2020. Website: https://www.robot-learning.uk/sim-to-real-tactile-icra-2020

  42. arXiv:1910.10799  [pdf, other]

    physics.bio-ph cond-mat.dis-nn q-bio.CB

    Force networks, torque balance and Airy stress in the planar vertex model of a confluent epithelium

    Authors: Oliver E. Jensen, Emma Johns, Sarah Woolner

    Abstract: The vertex model is a popular framework for modelling tightly packed biological cells, such as confluent epithelia. Cells are described by convex polygons tiling the plane and their equilibrium is found by minimizing a global mechanical energy, with vertex locations treated as degrees of freedom. Drawing on analogies with granular materials, we describe the force network for a localized monolayer…

    Submitted 2 April, 2020; v1 submitted 23 October, 2019; originally announced October 2019.

  43. arXiv:1901.08933  [pdf, other]

    cs.LG cs.CV stat.ML

    Self-Supervised Generalisation with Meta Auxiliary Learning

    Authors: Shikun Liu, Andrew J. Davison, Edward Johns

    Abstract: Learning with auxiliary tasks can improve the ability of a primary task to generalise. However, this comes at the cost of manually labelling auxiliary data. We propose a new method which automatically learns appropriate labels for an auxiliary task, such that any supervised learning task can be improved without requiring access to any further data. The approach is to train two neural networks: a l…

    Submitted 26 November, 2019; v1 submitted 25 January, 2019; originally announced January 2019.

    Comments: Published at Conference on Neural Information Processing Systems 2019

  44. arXiv:1803.10704  [pdf, other]

    cs.CV

    End-to-End Multi-Task Learning with Attention

    Authors: Shikun Liu, Edward Johns, Andrew J. Davison

    Abstract: We propose a novel multi-task learning architecture, which allows learning of task-specific feature-level attention. Our design, the Multi-Task Attention Network (MTAN), consists of a single shared network containing a global feature pool, together with a soft-attention module for each task. These modules allow for learning of task-specific features from the global features, whilst simultaneously…

    Submitted 5 April, 2019; v1 submitted 28 March, 2018; originally announced March 2018.

    Comments: Accepted at Computer Vision and Pattern Recognition (CVPR), 2019

  45. arXiv:1711.02909  [pdf, other]

    q-bio.CB physics.bio-ph

    Mechanical characterization of disordered and anisotropic cellular monolayers

    Authors: Alexander Nestor-Bergmann, Emma Johns, Sarah Woolner, Oliver E. Jensen

    Abstract: We consider a cellular monolayer, described using a vertex-based model, for which cells form a spatially disordered array of convex polygons that tile the plane. Equilibrium cell configurations are assumed to minimize a global energy defined in terms of cell areas and perimeters; energy is dissipated via dynamic area and length changes, as well as cell neighbour exchanges. The model captures our o…

    Submitted 24 February, 2018; v1 submitted 8 November, 2017; originally announced November 2017.

    Comments: 9 figures

    Journal ref: Phys. Rev. E 97, 052409 (2018)

  46. arXiv:1707.02267  [pdf, other]

    cs.RO cs.LG

    Transferring End-to-End Visuomotor Control from Simulation to Real World for a Multi-Stage Task

    Authors: Stephen James, Andrew J. Davison, Edward Johns

    Abstract: End-to-end control for robot manipulation and grasping is emerging as an attractive alternative to traditional pipelined approaches. However, end-to-end methods tend to either be slow to train, exhibit little or no generalisability, or lack the ability to accomplish long-horizon or multi-stage tasks. In this paper, we show how two simple techniques can lead to end-to-end (image to velocity) execut…

    Submitted 17 October, 2017; v1 submitted 7 July, 2017; originally announced July 2017.

    Comments: 1st Conference on Robot Learning (CoRL 2017), Mountain View, United States

  47. arXiv:1705.08260  [pdf]

    cs.CV cs.RO

    Self-Supervised Siamese Learning on Stereo Image Pairs for Depth Estimation in Robotic Surgery

    Authors: Menglong Ye, Edward Johns, Ankur Handa, Lin Zhang, Philip Pratt, Guang-Zhong Yang

    Abstract: Robotic surgery has become a powerful tool for performing minimally invasive procedures, providing advantages in dexterity, precision, and 3D vision, over traditional surgery. One popular robotic system is the da Vinci surgical platform, which allows preoperative information to be incorporated into live procedures using Augmented Reality (AR). Scene depth estimation is a prerequisite for AR, as ac…

    Submitted 17 May, 2017; originally announced May 2017.

    Comments: A two-page short report to be presented at the Hamlyn Symposium on Medical Robotics 2017. An extension of this work is in progress

  48. arXiv:1609.03759  [pdf, other]

    cs.RO cs.CV cs.LG

    3D Simulation for Robot Arm Control with Deep Q-Learning

    Authors: Stephen James, Edward Johns

    Abstract: Recent trends in robot arm control have seen a shift towards end-to-end solutions, using deep reinforcement learning to learn a controller directly from raw sensor data, rather than relying on a hand-crafted, modular pipeline. However, the high dimensionality of the state space often means that it is impractical to generate sufficient training data with real-world experiments. As an alternative so…

    Submitted 13 December, 2016; v1 submitted 13 September, 2016; originally announced September 2016.

    Comments: In NIPS 2016 Workshop: Deep Learning for Action and Interaction (https://sites.google.com/site/nips16interaction/)

  49. arXiv:1608.02239  [pdf, other]

    cs.RO cs.CV cs.LG

    Deep Learning a Grasp Function for Grasping under Gripper Pose Uncertainty

    Authors: Edward Johns, Stefan Leutenegger, Andrew J. Davison

    Abstract: This paper presents a new method for parallel-jaw grasping of isolated objects from depth images, under large gripper pose uncertainty. Whilst most approaches aim to predict the single best grasp pose from an image, our method first predicts a score for every possible grasp pose, which we denote the grasp function. With this, it is possible to achieve grasping robust to the gripper's pose uncertai…

    Submitted 7 August, 2016; originally announced August 2016.

    Comments: IROS 2016

  50. arXiv:1605.08359  [pdf, other]

    cs.CV cs.RO

    Pairwise Decomposition of Image Sequences for Active Multi-View Recognition

    Authors: Edward Johns, Stefan Leutenegger, Andrew J. Davison

    Abstract: A multi-view image sequence provides a much richer capacity for object recognition than from a single image. However, most existing solutions to multi-view recognition typically adopt hand-crafted, model-based geometric methods, which do not readily embrace recent trends in deep learning. We propose to bring Convolutional Neural Networks to generic multi-view recognition, by decomposing an image s…

    Submitted 26 May, 2016; originally announced May 2016.

    Comments: CVPR 2016 (oral)