Programmatically Interpretable Reinforcement Learning

Verma, Abhinav; Murali, Vijayaraghavan; Singh, Rishabh; Kohli, Pushmeet; Chaudhuri, Swarat

arXiv:1804.02477v1 (cs)

[Submitted on 6 Apr 2018 (this version), latest version 10 Apr 2019 (v3)]

Abstract: We study the problem of generating interpretable and verifiable policies through reinforcement learning. Unlike the popular Deep Reinforcement Learning (DRL) paradigm, in which the policy is represented by a neural network, the aim in Programmatically Interpretable Reinforcement Learning is to find a policy that can be represented in a high-level programming language. Such programmatic policies have the benefits of being more easily interpreted than neural networks, and being amenable to verification by symbolic methods. We propose a new method, called Neurally Directed Program Search (NDPS), for solving the challenging nonsmooth optimization problem of finding a programmatic policy with maximal reward. NDPS works by first learning a neural policy network using DRL, and then performing a local search over programmatic policies that seeks to minimize a distance from this neural "oracle". We evaluate NDPS on the task of learning to drive a simulated car in the TORCS car-racing environment. We demonstrate that NDPS is able to discover human-readable policies that pass some significant performance bars. We also find that a well-designed policy language can serve as a regularizer, and result in the discovery of policies that lead to smoother trajectories and are more easily transferred to environments not encountered during training.
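The two-stage idea in the abstract (a neural "oracle" first, then a local search over programs that imitates it) can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not specify the policy language, the distance measure, or the search procedure, so everything below is an assumption made for illustration: the `oracle` function is a fixed stand-in for a trained neural policy, the program template `clip(a * angle + b * track_pos)` is hypothetical, the distance is a mean squared difference in actions over sampled states, and the search is plain random hill climbing.

```python
import random


def clip(x, lo=-1.0, hi=1.0):
    """Clamp an action to the range [lo, hi]."""
    return max(lo, min(hi, x))


def oracle(obs):
    """Stand-in for a trained neural policy (hypothetical, not learned here)."""
    track_pos, angle = obs
    return clip(0.8 * angle - 0.5 * track_pos)


def make_program(params):
    """Build a policy from a tiny hypothetical template: clip(a*angle + b*track_pos)."""
    a, b = params

    def policy(obs):
        track_pos, angle = obs
        return clip(a * angle + b * track_pos)

    return policy


def distance(params, states):
    """Mean squared difference between the program's and the oracle's actions."""
    policy = make_program(params)
    return sum((policy(s) - oracle(s)) ** 2 for s in states) / len(states)


def sample_states(n=200, seed=1):
    """Sample observations uniformly; a real setup would use states visited in rollouts."""
    rng = random.Random(seed)
    return [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(n)]


def local_search(states, steps=2000, step_size=0.1, seed=0):
    """Random hill climbing: keep a perturbed candidate only if it is closer to the oracle."""
    rng = random.Random(seed)
    best = (0.0, 0.0)
    best_dist = distance(best, states)
    for _ in range(steps):
        cand = tuple(p + rng.uniform(-step_size, step_size) for p in best)
        cand_dist = distance(cand, states)
        if cand_dist < best_dist:
            best, best_dist = cand, cand_dist
    return best, best_dist


if __name__ == "__main__":
    states = sample_states()
    params, dist = local_search(states)
    print("program parameters:", params)
    print("distance to oracle:", dist)
```

The result of the search is a pair of readable coefficients rather than network weights, which is the property the abstract emphasizes. The sketch only imitates the oracle on sampled states and does not evaluate reward in an environment.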

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Programming Languages (cs.PL); Machine Learning (stat.ML)
Cite as:	arXiv:1804.02477 [cs.LG]
	(or arXiv:1804.02477v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1804.02477

Submission history

From: Abhinav Verma
[v1] Fri, 6 Apr 2018 22:17:18 UTC (108 KB)
[v2] Fri, 8 Jun 2018 02:27:26 UTC (884 KB)
[v3] Wed, 10 Apr 2019 09:09:46 UTC (1,362 KB)
