Distributed Distributional Deterministic Policy Gradients

Barth-Maron, Gabriel; Hoffman, Matthew W.; Budden, David; Dabney, Will; Horgan, Dan; TB, Dhruva; Muldal, Alistair; Heess, Nicolas; Lillicrap, Timothy


arXiv:1804.08617 (cs)

[Submitted on 23 Apr 2018]



Abstract: This work adopts the very successful distributional perspective on reinforcement learning and adapts it to the continuous control setting. We combine this within a distributed framework for off-policy learning in order to develop what we call the Distributed Distributional Deep Deterministic Policy Gradient algorithm, D4PG. We also combine this technique with a number of additional, simple improvements such as the use of $N$-step returns and prioritized experience replay. Experimentally, we examine the contribution of each of these individual components and show how they interact, as well as their combined contributions. Our results show that across a wide variety of simple control tasks, difficult manipulation tasks, and a set of hard obstacle-based locomotion tasks, the D4PG algorithm achieves state-of-the-art performance.
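As an illustration of one component named in the abstract, the following is a minimal sketch of a scalar $N$-step return target. It is not the paper's implementation: D4PG applies the $N$-step idea to a distributional critic, whereas this sketch uses a plain scalar bootstrap value for simplicity. The function name `n_step_return`, its parameters, and the default discount are assumptions chosen for this example.

```python
from typing import Sequence


def n_step_return(
    rewards: Sequence[float],
    bootstrap_value: float,
    gamma: float = 0.99,  # assumed discount factor, not taken from the paper
) -> float:
    """Return r_0 + gamma*r_1 + ... + gamma^(N-1)*r_(N-1) + gamma^N * bootstrap_value.

    `rewards` holds the N rewards observed along a trajectory segment and
    `bootstrap_value` is a (hypothetical) critic estimate at the state
    reached after those N steps.
    """
    target = bootstrap_value
    # Fold from the last reward backwards so each step applies one discount.
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target
```

With an empty reward sequence the function simply returns the bootstrap value, and with a single reward it reduces to the usual one-step target.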

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:1804.08617 [cs.LG]
	(or arXiv:1804.08617v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1804.08617

Submission history

From: Matthew W. Hoffman
[v1] Mon, 23 Apr 2018 11:57:21 UTC (8,711 KB)
