Learning Sensorimotor Primitives of Sequential Manipulation Tasks from Visual Demonstrations

Liang, Junchi; Wen, Bowen; Bekris, Kostas; Boularias, Abdeslam

Abstract:This work aims to learn how to perform complex robot manipulation tasks that are composed of several, consecutively executed low-level sub-tasks, given as input a few visual demonstrations of the tasks performed by a person. The sub-tasks consist of moving the robot's end-effector until it reaches a sub-goal region in the task space, performing an action, and triggering the next sub-task when a pre-condition is met. Most prior work in this domain has been concerned with learning only low-level tasks, such as hitting a ball or reaching an object and grasping it. This paper describes a new neural network-based framework for learning simultaneously low-level policies as well as high-level policies, such as deciding which object to pick next or where to place it relative to other objects in the scene. A key feature of the proposed approach is that the policies are learned directly from raw videos of task demonstrations, without any manual annotation or post-processing of the data. Empirical results on object manipulation tasks with a robotic arm show that the proposed network can efficiently learn from real visual demonstrations to perform the tasks, and outperforms popular imitation learning algorithms.

Subjects:	Robotics (cs.RO); Machine Learning (cs.LG)
Cite as:	arXiv:2203.03797 [cs.RO]
	(or arXiv:2203.03797v1 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2203.03797

Computer Science > Robotics

Title:Learning Sensorimotor Primitives of Sequential Manipulation Tasks from Visual Demonstrations

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators