Search | arXiv e-print repository

arXiv:2410.19693

MILES: Making Imitation Learning Easy with Self-Supervision

Authors: Georgios Papagiannis, Edward Johns

Abstract: Data collection in imitation learning often requires significant, laborious human supervision, such as numerous demonstrations, and/or frequent environment resets for methods that incorporate reinforcement learning. In this work, we propose an alternative approach, MILES: a fully autonomous, self-supervised data collection paradigm, and we show that this enables efficient policy learning from just a single demonstration and a single environment reset. MILES autonomously learns a policy for returning to and then following the single demonstration, whilst being self-guided during data collection, eliminating the need for additional human interventions. We evaluated MILES across several real-world tasks, including tasks that require precise contact-rich manipulation such as locking a lock with a key. We found that, under the constraints of a single demonstration and no repeated environment resetting, MILES significantly outperforms state-of-the-art alternatives like imitation learning methods that leverage reinforcement learning. Videos of our experiments and code can be found on our webpage: www.robot-learning.uk/miles.

Submitted 25 October, 2024; originally announced October 2024.

Comments: Published at the Conference on Robot Learning (CoRL) 2024

arXiv:2408.00178

Adapting Skills to Novel Grasps: A Self-Supervised Approach

Authors: Georgios Papagiannis, Kamil Dreczkowski, Vitalis Vosylius, Edward Johns

Abstract: In this paper, we study the problem of adapting manipulation trajectories involving grasped objects (e.g. tools) defined for a single grasp pose to novel grasp poses. A common approach to address this is to define a new trajectory for each possible grasp explicitly, but this is highly inefficient. Instead, we propose a method to adapt such trajectories directly while only requiring a period of self-supervised data collection, during which a camera observes the robot's end-effector moving with the object rigidly grasped. Importantly, our method requires no prior knowledge of the grasped object (such as a 3D CAD model), it can work with RGB images, depth images, or both, and it requires no camera calibration. Through a series of real-world experiments involving 1360 evaluations, we find that self-supervised RGB data consistently outperforms alternatives that rely on depth images including several state-of-the-art pose estimation methods. Compared to the best-performing baseline, our method results in an average of 28.5% higher success rate when adapting manipulation trajectories to novel grasps on several everyday tasks. Videos of the experiments are available on our webpage at https://www.robot-learning.uk/adapting-skills

Submitted 31 July, 2024; originally announced August 2024.

Comments: Accepted at IROS 2024

arXiv:2407.12957

R+X: Retrieval and Execution from Everyday Human Videos

Authors: Georgios Papagiannis, Norman Di Palo, Pietro Vitiello, Edward Johns

Abstract: We present R+X, a framework which enables robots to learn skills from long, unlabelled, first-person videos of humans performing everyday tasks. Given a language command from a human, R+X first retrieves short video clips containing relevant behaviour, and then executes the skill by conditioning an in-context imitation learning method (KAT) on this behaviour. By leveraging a Vision Language Model (VLM) for retrieval, R+X does not require any manual annotation of the videos, and by leveraging in-context learning for execution, robots can perform commanded skills immediately, without requiring a period of training on the retrieved videos. Experiments studying a range of everyday household tasks show that R+X succeeds at translating unlabelled human videos into robust robot skills, and that R+X outperforms several recent alternative methods. Videos and code are available at https://www.robot-learning.uk/r-plus-x.

Submitted 3 April, 2025; v1 submitted 17 July, 2024; originally announced July 2024.

Comments: Published at the IEEE International Conference on Robotics and Automation (ICRA) 2025

arXiv:2204.02863

Demonstrate Once, Imitate Immediately (DOME): Learning Visual Servoing for One-Shot Imitation Learning

Authors: Eugene Valassakis, Georgios Papagiannis, Norman Di Palo, Edward Johns

Abstract: We present DOME, a novel method for one-shot imitation learning, where a task can be learned from just a single demonstration and then be deployed immediately, without any further data collection or training. DOME does not require prior task or object knowledge, and can perform the task in novel object configurations and with distractors. At its core, DOME uses an image-conditioned object segmentation network followed by a learned visual servoing network, to move the robot's end-effector to the same relative pose to the object as during the demonstration, after which the task can be completed by replaying the demonstration's end-effector velocities. We show that DOME achieves near 100% success rate on 7 real-world everyday tasks, and we perform several studies to thoroughly understand each individual component of DOME. Videos and supplementary material are available at: https://www.robot-learning.uk/dome .

Submitted 27 July, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

Comments: To be published at IROS 2022. 7 figures, 8 pages. Videos and supplementary material are available at: https://www.robot-learning.uk/dome

arXiv:2011.05748

End-To-End Semi-supervised Learning for Differentiable Particle Filters

Authors: Hao Wen, Xiongjie Chen, Georgios Papagiannis, Conghui Hu, Yunpeng Li

Abstract: Recent advances in incorporating neural networks into particle filters provide the desired flexibility to apply particle filters in large-scale real-world applications. The dynamic and measurement models in this framework are learnable through the differentiable implementation of particle filters. Past efforts in optimising such models often require the knowledge of true states which can be expensive to obtain or even unavailable in practice. In this paper, in order to reduce the demand for annotated data, we present an end-to-end learning objective based upon the maximisation of a pseudo-likelihood function which can improve the estimation of states when large portion of true states are unknown. We assess performance of the proposed method in state estimation tasks in robotics with simulated and real-world datasets.

Submitted 28 March, 2021; v1 submitted 11 November, 2020; originally announced November 2020.

Comments: Accepted in ICRA 2021

arXiv:2008.09167

Imitation Learning with Sinkhorn Distances

Authors: Georgios Papagiannis, Yunpeng Li

Abstract: Imitation learning algorithms have been interpreted as variants of divergence minimization problems. The ability to compare occupancy measures between experts and learners is crucial in their effectiveness in learning from demonstrations. In this paper, we present tractable solutions by formulating imitation learning as minimization of the Sinkhorn distance between occupancy measures. The formulation combines the valuable properties of optimal transport metrics in comparing non-overlapping distributions with a cosine distance cost defined in an adversarially learned feature space. This leads to a highly discriminative critic network and optimal transport plan that subsequently guide imitation learning. We evaluate the proposed approach using both the reward metric and the Sinkhorn distance metric on a number of MuJoCo experiments. For the implementation and reproducing results please refer to the following repository https://github.com/gpapagiannis/sinkhorn-imitation.

Submitted 2 July, 2022; v1 submitted 20 August, 2020; originally announced August 2020.

Comments: Published as a conference paper at ECML PKDD 2022

Journal ref: ECML PKDD 2022
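
Note: the abstract above describes comparing expert and learner occupancy measures with a Sinkhorn distance under a cosine cost. The following is a minimal sketch of that general idea only, not the authors' implementation (which is in the repository linked in the abstract). It assumes uniform weights on two small sample sets and applies the cosine cost to raw feature vectors rather than to an adversarially learned feature space. The names `cosine_cost` and `sinkhorn_distance`, and the values of `eps` and `n_iters`, are hypothetical choices made for illustration.

```python
import math


def cosine_cost(x, y):
    """Cosine distance 1 - cos(x, y) between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


def sinkhorn_distance(xs, ys, eps=0.1, n_iters=200):
    """Entropy-regularised transport cost between two uniform empirical measures.

    xs, ys: lists of feature vectors (e.g. expert and learner samples).
    eps: regularisation strength (assumed value, not taken from the paper).
    Returns (transport cost, transport plan).
    """
    n, m = len(xs), len(ys)
    a = [1.0 / n] * n  # uniform weights on the first sample set
    b = [1.0 / m] * m  # uniform weights on the second sample set
    cost = [[cosine_cost(x, y) for y in ys] for x in xs]
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]

    # Sinkhorn iterations: alternately rescale rows and columns of the kernel
    # so that the resulting plan matches the two marginals.
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]

    plan = [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return total, plan
```

In this sketch, two identical sample sets give a cost close to zero, and the cost grows as the learner's samples point away from the expert's.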

arXiv:1910.07576

doi 10.1016/j.renene.2020.01.061

Assessing the viability of Battery Energy Storage Systems coupled with Photovoltaics under a pure self-consumption scheme

Authors: Georgios A. Barzegkar-Ntovom, Nikolas G. Chatzigeorgiou, Angelos I. Nousdilis, Styliani A. Vomva, Georgios C. Kryonidis, Eleftherios O. Kontis, George E. Georghiou, Georgios C. Christoforidis, Grigoris K. Papagiannis

Abstract: Over the last few decades, there is a constantly increasing deployment of solar photovoltaic (PV) systems both at the commercial and residential building sector. However, the steadily growing PV penetration poses several technical problems to electric power systems, mainly related to power quality issues. To this context, the exploitation of energy storage systems integrated along with PVs could constitute a possible solution. The scope of this paper is to thoroughly evaluate the economic viability of hybrid PV-and-Storage systems at the residential building level under a future pure self-consumption policy that provides no reimbursement for excess PV energy injected to the grid. For this purpose, an indicator referred to as the Levelized Cost of Use is utilized for the assessment of the competitiveness of hybrid PV-and-Storage systems in the energy market, considering various sizes of the hybrid system, battery energy storage costs and prosumer types for six Mediterranean countries.

Submitted 8 February, 2020; v1 submitted 16 October, 2019; originally announced October 2019.

Comments: 20 pages, 8 figures, Preprint submitted to Journal of Renewable Energy

arXiv:1909.03331

Deep Reinforcement Learning for Control of Probabilistic Boolean Networks

Authors: Georgios Papagiannis, Sotiris Moschoyiannis

Abstract: Probabilistic Boolean Networks (PBNs) were introduced as a computational model for the study of complex dynamical systems, such as Gene Regulatory Networks (GRNs). Controllability in this context is the process of making strategic interventions to the state of a network in order to drive it towards some other state that exhibits favourable biological properties. In this paper we study the ability of a Double Deep Q-Network with Prioritized Experience Replay in learning control strategies within a finite number of time steps that drive a PBN towards a target state, typically an attractor. The control method is model-free and does not require knowledge of the network's underlying dynamics, making it suitable for applications where inference of such dynamics is intractable. We present extensive experiment results on two synthetic PBNs and the PBN model constructed directly from gene-expression data of a study on metastatic-melanoma.

Submitted 7 September, 2020; v1 submitted 7 September, 2019; originally announced September 2019.
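
Note: the abstract above refers to a Double Deep Q-Network. As a rough illustration of the standard Double DQN bootstrap target only (the online network selects the next action, the target network evaluates it), a minimal sketch follows. It is not the authors' code: prioritized experience replay and the PBN environment are omitted, the Q-values are plain lists standing in for network outputs, and the function name `double_dqn_target` and the default `gamma` are hypothetical.

```python
def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: r + gamma * Q_target(s', argmax_a Q_online(s', a)).

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    done: True if the episode ended, in which case there is no bootstrap term.
    """
    if done:
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... while its value is read from the target network.
    return reward + gamma * next_q_target[best_action]
```

Decoupling selection from evaluation in this way is what distinguishes the Double DQN target from the plain DQN target, which would take the maximum over `next_q_target` directly.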

arXiv:1408.5265

A Bayesian Ensemble Regression Framework on the Angry Birds Game

Authors: Nikolaos Tziortziotis, Georgios Papagiannis, Konstantinos Blekas

Abstract: An ensemble inference mechanism is proposed on the Angry Birds domain. It is based on an efficient tree structure for encoding and representing game screenshots, where it exploits its enhanced modeling capability. This has the advantage to establish an informative feature space and modify the task of game playing to a regression analysis problem. To this direction, we assume that each type of object material and bird pair has its own Bayesian linear regression model. In this way, a multi-model regression framework is designed that simultaneously calculates the conditional expectations of several objects and makes a target decision through an ensemble of regression models. Learning procedure is performed according to an online estimation strategy for the model parameters. We provide comparative experimental results on several game levels that empirically illustrate the efficiency of the proposed methodology.

Submitted 25 August, 2014; v1 submitted 22 August, 2014; originally announced August 2014.

Comments: Angry Birds AI Symposium, ECAI 2014

Showing 1–9 of 9 results for author: Papagiannis, G