Environment Probing Interaction Policies

Zhou, Wenxuan; Pinto, Lerrel; Gupta, Abhinav


arXiv:1907.11740 (cs)

[Submitted on 26 Jul 2019]


Abstract: A key challenge in reinforcement learning (RL) is environment generalization: a policy trained to solve a task in one environment often fails to solve the same task in a slightly different test environment. A common approach to improve inter-environment transfer is to learn policies that are invariant to the distribution of testing environments. However, we argue that instead of being invariant, the policy should identify the specific nuances of an environment and exploit them to achieve better performance. In this work, we propose the 'Environment-Probing' Interaction (EPI) policy, a policy that probes a new environment to extract an implicit understanding of that environment's behavior. Once this environment-specific information is obtained, it is used as an additional input to a task-specific policy that can now perform environment-conditioned actions to solve a task. To learn these EPI-policies, we present a reward function based on transition predictability. Specifically, a higher reward is given if the trajectory generated by the EPI-policy can be used to better predict transitions. We experimentally show that EPI-conditioned task-specific policies significantly outperform commonly used policy generalization methods on novel testing environments.
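
The transition-predictability reward described in the abstract can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the 1-D dynamics with a hidden `inv_mass` parameter, the hand-written least-squares `embed` function standing in for a learned trajectory embedding, and the choice to score a probe as the prediction error of an unconditioned baseline minus the error of a predictor conditioned on the embedding. The function names (`rollout`, `embed`, `epi_reward`) are hypothetical.

```python
import numpy as np

def step(s, a, inv_mass):
    # Toy 1-D dynamics with a hidden environment parameter (illustrative assumption).
    return s + a * inv_mass

def rollout(actions, inv_mass, s0=0.0):
    # Execute an action sequence and return an array of (s, a, s') rows.
    s, rows = s0, []
    for a in actions:
        s_next = step(s, a, inv_mass)
        rows.append((s, a, s_next))
        s = s_next
    return np.array(rows)

def embed(trajectory, prior_inv_mass):
    # Stand-in for a learned embedding: least-squares estimate of the
    # hidden parameter from the probing trajectory.
    s, a, s_next = trajectory.T
    denom = np.sum(a * a)
    if denom == 0.0:
        return prior_inv_mass  # the probe revealed nothing about the environment
    return float(np.sum(a * (s_next - s)) / denom)

def prediction_error(param, transitions):
    # Mean squared error of predicting s' from (s, a) with the given parameter.
    s, a, s_next = transitions.T
    return float(np.mean((s + a * param - s_next) ** 2))

def epi_reward(probe_actions, inv_mass, eval_transitions, prior_inv_mass):
    # Reward = how much the probe's embedding improves transition prediction
    # over a baseline that only knows the prior (assumed form of the reward).
    z = embed(rollout(probe_actions, inv_mass), prior_inv_mass)
    baseline = prediction_error(prior_inv_mass, eval_transitions)
    conditioned = prediction_error(z, eval_transitions)
    return baseline - conditioned

rng = np.random.default_rng(0)
true_inv_mass, prior = 2.0, 1.0
eval_transitions = rollout(rng.uniform(-1.0, 1.0, size=20), true_inv_mass)

informative = epi_reward(np.array([1.0, -1.0, 0.5]), true_inv_mass, eval_transitions, prior)
idle = epi_reward(np.zeros(3), true_inv_mass, eval_transitions, prior)
print(informative > idle)
```

In this toy setting a probe that applies nonzero actions exposes the hidden parameter and earns a positive reward, while a probe that does nothing earns zero, which is the qualitative behavior the abstract attributes to the reward.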

Comments:	Published as a conference paper at ICLR 2019
Subjects:	Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:1907.11740 [cs.RO]
	(or arXiv:1907.11740v1 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.1907.11740

Submission history

From: Wenxuan Zhou
[v1] Fri, 26 Jul 2019 18:19:25 UTC (1,145 KB)
