Ranking Policy Decisions

Pouget, Hadrien; Chockler, Hana; Sun, Youcheng; Kroening, Daniel

arXiv:2008.13607v1 (cs)

[Submitted on 31 Aug 2020 (this version), latest version 26 Oct 2021 (v3)]

Abstract: Policies trained via Reinforcement Learning (RL) are often needlessly complex, making them more difficult to analyse and interpret. In a run with $n$ time steps, a policy will decide $n$ times on an action to take, even when only a tiny subset of these decisions delivers value over selecting a simple default action. Given a pre-trained policy, we propose a black-box method based on statistical fault localisation that ranks the states of the environment according to the importance of decisions made in those states. We evaluate our ranking method by creating new, simpler policies by pruning decisions identified as unimportant, and measure the impact on performance. Our experimental results on a diverse set of standard benchmarks (gridworld, CartPole, Atari games) show that in some cases fewer than half of the decisions made contribute to the expected reward. We furthermore show that the decisions made in the most frequently visited states are not the most important for the expected reward.
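
The ranking-and-pruning idea described in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the corridor environment (`PIT`, `GOAL`), the stand-in `policy`, the 0.5 mutation probability, and the use of an Ochiai-style score, which is one common measure from statistical fault localisation (the abstract does not say which measure the authors use). Each run replaces the policy's action with the default action in a random subset of states; a state scores highly when falling back to the default there is associated with failed runs.

```python
import math
import random

STEP, JUMP = 1, 2        # actions: move one or two cells to the right
PIT, GOAL = 4, 7         # hypothetical corridor: the pit ends a run in failure
DEFAULT_ACTION = STEP    # the "simple default action"


def policy(state):
    """Stand-in for a pre-trained policy, queried only as a black box."""
    return JUMP if state in (0, 3) else STEP


def run_episode(use_default):
    """Run one episode; states in `use_default` take the default action.

    Returns (passed, visited), where `visited` holds the states in which
    a decision was made during the run.
    """
    state, visited = 0, set()
    while state < GOAL:
        if state == PIT:
            return False, visited
        visited.add(state)
        action = DEFAULT_ACTION if state in use_default else policy(state)
        state += action
    return True, visited


def rank_states(n_runs=500, seed=0):
    """Rank states with an Ochiai-style score (an assumed measure)."""
    rng = random.Random(seed)
    states = [s for s in range(GOAL) if s != PIT]
    # mf/mp: default used in s and the run failed/passed; uf: policy used, failed.
    counts = {s: {"mf": 0, "mp": 0, "uf": 0} for s in states}
    for _ in range(n_runs):
        mutated = {s for s in states if rng.random() < 0.5}
        passed, visited = run_episode(mutated)
        for s in visited:  # only states actually reached contribute
            if s in mutated:
                counts[s]["mp" if passed else "mf"] += 1
            elif not passed:
                counts[s]["uf"] += 1

    def ochiai(c):
        denom = math.sqrt((c["mf"] + c["uf"]) * (c["mf"] + c["mp"]))
        return c["mf"] / denom if denom else 0.0

    scores = {s: ochiai(c) for s, c in counts.items()}
    ranking = sorted(states, key=lambda s: scores[s], reverse=True)
    return ranking, scores


def prune(ranking, keep):
    """Return the states that fall back to the default action when only
    the top `keep` ranked decisions are retained."""
    return set(ranking[keep:])


ranking, scores = rank_states()
passed_pruned, _ = run_episode(prune(ranking, keep=1))
print("ranking:", ranking)
print("pruned policy (top-1 decision kept) passes:", passed_pruned)
```

In this toy setting the stand-in policy also jumps in state 0, a decision that differs from the default but does not affect the outcome, so a pruned policy that keeps only the top-ranked decision can still reach the goal. That mirrors, on a much smaller scale, the evaluation by pruning that the abstract describes.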

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2008.13607 [cs.LG] (or arXiv:2008.13607v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2008.13607

Submission history

From: Hadrien Pouget
[v1] Mon, 31 Aug 2020 13:54:44 UTC (636 KB)
[v2] Sun, 3 Oct 2021 15:57:30 UTC (636 KB)
[v3] Tue, 26 Oct 2021 17:28:22 UTC (1,754 KB)
