Learning and reusing primitive behaviours to improve Hindsight Experience Replay sample efficiency

Sanchez, Francisco Roldan; Wang, Qiang; Bulens, David Cordova; McGuinness, Kevin; Redmond, Stephen; O'Connor, Noel

arXiv:2310.01827 (cs)

[Submitted on 3 Oct 2023 (v1), last revised 19 Nov 2023 (this version, v2)]

Abstract: Hindsight Experience Replay (HER) is a reinforcement learning (RL) technique that has proven very efficient for training off-policy RL-based agents to solve goal-based robotic manipulation tasks using sparse rewards. Although HER improves the sample efficiency of RL-based agents by learning from mistakes made in past experiences, it does not provide any guidance while exploring the environment. This leads to very long training times because of the volume of experience required to train an agent with this replay strategy. In this paper, we propose a method that uses primitive behaviours, previously learned to solve simple tasks, to guide the agent toward more rewarding actions during exploration while it learns other, more complex tasks. This guidance is not provided by a manually designed curriculum; instead, a critic network decides at each timestep whether or not to use the actions proposed by the previously learned primitive policies. We evaluate our method by comparing its performance against HER and other more efficient variations of this algorithm in several block manipulation tasks. We demonstrate that agents can learn a successful policy faster when using our proposed method, in terms of both sample efficiency and computation time. Code is available at this https URL.
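The abstract does not spell out how the critic chooses between the agent's own action and those proposed by the primitive policies. As a rough illustration only, one plausible reading is that the critic scores every candidate action and the highest-scoring one is executed. The sketch below assumes this argmax rule; the names select_action and toy_critic, the critic signature (state, goal, action), and the toy distance-based critic are all hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def select_action(critic, state, goal, agent_action, primitive_actions):
    """Pick the candidate action that the critic scores highest.

    Assumption (not confirmed by the abstract): the critic is a callable
    Q(state, goal, action) -> float, and the decision rule is a plain argmax
    over the agent's own action and the primitive policies' proposals.
    Returns the chosen action and its index (0 = the agent's own action).
    """
    candidates = [agent_action, *primitive_actions]
    q_values = [critic(state, goal, action) for action in candidates]
    best = int(np.argmax(q_values))
    return candidates[best], best


def toy_critic(state, goal, action):
    """Hypothetical stand-in critic: scores an action higher the closer
    state + action ends up to the goal. A trained critic network would
    replace this in practice."""
    return -float(np.linalg.norm(state + action - goal))


state = np.array([0.0, 0.0])
goal = np.array([1.0, 0.0])
agent_action = np.array([0.0, 0.5])          # the learner's own proposal
primitive_actions = [
    np.array([0.5, 0.0]),                    # e.g. a "move toward target" primitive
    np.array([-0.5, 0.0]),                   # e.g. a primitive that moves away
]

action, index = select_action(toy_critic, state, goal, agent_action, primitive_actions)
print(index, action)
```

In this toy setting the first primitive's proposal is selected because it brings the state closest to the goal. During training, the chosen action would be executed and the resulting transition stored in the replay buffer for HER to relabel, although the abstract leaves those details to the paper.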

Comments:	6 pages, 2 figures, 1 algorithm, 1 table. Version accepted to ICARA 2024
Subjects:	Robotics (cs.RO); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2310.01827 [cs.RO]
	(or arXiv:2310.01827v2 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2310.01827

Submission history

From: Francisco Roldan Sanchez
[v1] Tue, 3 Oct 2023 06:49:57 UTC (1,295 KB)
[v2] Sun, 19 Nov 2023 15:55:56 UTC (1,296 KB)
