Sequence Modeling of Temporal Credit Assignment for Episodic Reinforcement Learning

Liu, Yang; Luo, Yunan; Zhong, Yuanyi; Chen, Xi; Liu, Qiang; Peng, Jian

Computer Science > Machine Learning

arXiv:1905.13420 (cs)

[Submitted on 31 May 2019]


Abstract:Recent advances in deep reinforcement learning algorithms have shown great promise in solving many challenging real-world problems, including the game of Go and robotic applications. Usually, these algorithms need a carefully designed reward function to guide training at each time step. However, in the real world, it is non-trivial to design such a reward function, and the only signal available is usually obtained at the end of a trajectory, also known as the episodic reward or return. In this work, we introduce a new algorithm for temporal credit assignment, which learns to decompose the episodic return back to each time step in the trajectory using deep neural networks. With this learned reward signal, the learning efficiency can be substantially improved for episodic reinforcement learning. In particular, we find that expressive language models such as the Transformer can be adopted for learning the importance and the dependency of states in the trajectory, thereby providing high-quality and interpretable learned reward signals. We have performed extensive experiments on a set of MuJoCo continuous locomotion control tasks with only episodic returns and demonstrated the effectiveness of our algorithm.
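
The idea of decomposing an episodic return into per-step rewards can be illustrated with a small sketch. This is not the authors' implementation: it swaps the Transformer for a linear reward model, assumes the objective is a squared error between each trajectory's episodic return and the sum of its predicted per-step rewards, and uses synthetic data. The names `fit_linear_reward`, `per_step_rewards`, and the per-step feature arrays are hypothetical.

```python
import numpy as np

def fit_linear_reward(trajectories, episodic_returns, ridge=1e-6):
    """Fit w so that sum_t (w @ phi_t) approximates each episodic return.

    trajectories: list of arrays of shape (T_i, d), hypothetical per-step features.
    episodic_returns: array of shape (N,), one scalar return per trajectory.
    """
    # For a linear model, the summed prediction equals w @ (sum of features),
    # so the assumed decomposition loss reduces to ridge regression on feature sums.
    X = np.stack([traj.sum(axis=0) for traj in trajectories])  # shape (N, d)
    y = np.asarray(episodic_returns, dtype=float)
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ y)

def per_step_rewards(w, trajectory):
    """Return the learned dense reward for every time step of one trajectory."""
    return trajectory @ w

# Synthetic data (an assumption for illustration): the hidden per-step reward
# is linear in the features, but only its sum over the episode is observed.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])
trajectories = [rng.normal(size=(rng.integers(5, 15), 3)) for _ in range(200)]
returns = np.array([(traj @ true_w).sum() for traj in trajectories])

w = fit_linear_reward(trajectories, returns)
dense = per_step_rewards(w, trajectories[0])
```

Here `dense` holds one learned reward per time step of the first trajectory, and those rewards sum approximately to that trajectory's episodic return. A downstream policy-optimization method could consume such dense rewards in place of the single end-of-episode signal.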

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1905.13420 [cs.LG]
	(or arXiv:1905.13420v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1905.13420

Submission history

From: Yang Liu [view email]
[v1] Fri, 31 May 2019 05:20:12 UTC (3,730 KB)
