Examining average and discounted reward optimality criteria in reinforcement learning

Dewanto, Vektor; Gallagher, Marcus

arXiv:2107.01348 (cs)

[Submitted on 3 Jul 2021 (v1), last revised 1 Sep 2022 (this version, v2)]

Abstract: In reinforcement learning (RL), the goal is to obtain an optimal policy, for which the optimality criterion is fundamentally important. Two major optimality criteria are average and discounted rewards. While the latter is more popular, it is problematic to apply in environments without an inherent notion of discounting. This motivates us to revisit a) the progression of optimality criteria in dynamic programming, b) justification for and complication of an artificial discount factor, and c) benefits of directly maximizing the average reward criterion, which is discounting-free. Our contributions include a thorough examination of the relationship between average and discounted rewards, as well as a discussion of their pros and cons in RL. We emphasize that average-reward RL methods possess the ingredient and mechanism for applying a family of discounting-free optimality criteria (Veinott, 1969) to RL.
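As a rough illustration of the relationship between the two criteria mentioned in the abstract, the sketch below compares them on a tiny two-state Markov chain induced by a fixed policy. The chain, its rewards, and the names `P`, `R`, `gain`, and `discounted_value` are hypothetical and not taken from the paper. It relies only on the standard result that, for a fixed stationary policy on a finite ergodic chain, the discounted value scaled by (1 - gamma) approaches the average reward (gain) as gamma tends to 1.

```python
# Hypothetical two-state chain under a fixed policy (illustrative numbers only).
P = [[0.9, 0.1],   # transition probabilities from state 0
     [0.5, 0.5]]   # transition probabilities from state 1
R = [1.0, 0.0]     # expected one-step reward in each state


def gain(P, r):
    """Average reward of a two-state ergodic chain.

    The stationary distribution of a two-state chain has the closed form
    pi = (p10, p01) / (p01 + p10); the gain is pi dotted with r.
    """
    p01, p10 = P[0][1], P[1][0]
    pi = (p10 / (p01 + p10), p01 / (p01 + p10))
    return pi[0] * r[0] + pi[1] * r[1]


def discounted_value(P, r, gamma):
    """Discounted value of each state: solve (I - gamma * P) v = r.

    The 2x2 linear system is solved with Cramer's rule.
    """
    a, b = 1.0 - gamma * P[0][0], -gamma * P[0][1]
    c, d = -gamma * P[1][0], 1.0 - gamma * P[1][1]
    det = a * d - b * c
    v0 = (r[0] * d - b * r[1]) / det
    v1 = (a * r[1] - c * r[0]) / det
    return (v0, v1)


if __name__ == "__main__":
    g = gain(P, R)
    print("gain:", g)
    for gamma in (0.9, 0.99, 0.999, 0.9999):
        v = discounted_value(P, R, gamma)
        # The scaled values approach the gain from both states as gamma -> 1.
        print(gamma, [(1.0 - gamma) * x for x in v])
```

In this sketch the gain is the same from every state, while the discounted values differ by state and grow without bound as gamma approaches 1; only their scaled versions converge.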

Comments:	23 pages, restructuring, adding more details
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Systems and Control (eess.SY)
Cite as:	arXiv:2107.01348 [cs.LG]
	(or arXiv:2107.01348v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2107.01348

Submission history

From: Vektor Dewanto
[v1] Sat, 3 Jul 2021 05:28:56 UTC (4,370 KB)
[v2] Thu, 1 Sep 2022 22:42:39 UTC (4,463 KB)
