Search | arXiv e-print repository

arXiv:1911.00497 [pdf, other]

A Narration-based Reward Shaping Approach using Grounded Natural Language Commands

Authors: Nicholas Waytowich, Sean L. Barton, Vernon Lawhern, Garrett Warnell

Abstract: While deep reinforcement learning techniques have led to agents that are successfully able to learn to perform a number of tasks that had been previously unlearnable, these techniques are still susceptible to the longstanding problem of reward sparsity. This is especially true for tasks such as training an agent to play StarCraft II, a real-time strategy game where reward is only given at the end of a game which is usually very long. While this problem can be addressed through reward shaping, such approaches typically require a human expert with specialized knowledge. Inspired by the vision of enabling reward shaping through the more-accessible paradigm of natural-language narration, we develop a technique that can provide the benefits of reward shaping using natural language commands. Our narration-guided RL agent projects sequences of natural-language commands into the same high-dimensional representation space as corresponding goal states. We show that we can get improved performance with our method compared to traditional reward-shaping approaches. Additionally, we demonstrate the ability of our method to generalize to unseen natural-language commands.

Submitted 31 October, 2019; originally announced November 2019.

Comments: Presented at the Imitation, Intent and Interaction (I3) workshop, ICML 2019. arXiv admin note: substantial text overlap with arXiv:1906.02671

arXiv:1906.02671 [pdf, other]

Grounding Natural Language Commands to StarCraft II Game States for Narration-Guided Reinforcement Learning

Authors: Nicholas Waytowich, Sean L. Barton, Vernon Lawhern, Ethan Stump, Garrett Warnell

Abstract: While deep reinforcement learning techniques have led to agents that are successfully able to learn to perform a number of tasks that had been previously unlearnable, these techniques are still susceptible to the longstanding problem of reward sparsity. This is especially true for tasks such as training an agent to play StarCraft II, a real-time strategy game where reward is only given at the end of a game which is usually very long. While this problem can be addressed through reward shaping, such approaches typically require a human expert with specialized knowledge. Inspired by the vision of enabling reward shaping through the more-accessible paradigm of natural-language narration, we investigate to what extent we can contextualize these narrations by grounding them to the goal-specific states. We present a mutual-embedding model using a multi-input deep-neural network that projects a sequence of natural language commands into the same high-dimensional representation space as corresponding goal states. We show that using this model we can learn an embedding space with separable and distinct clusters that accurately maps natural-language commands to corresponding game states. We also discuss how this model can allow for the use of narrations as a robust form of reward shaping to improve RL performance and efficiency.

Submitted 24 April, 2019; originally announced June 2019.

Comments: 10 pages, 3 figures. Published at SPIE 2019

arXiv:1809.04918 [pdf, ps, other]

Coordination-driven learning in multi-agent problem spaces

Authors: Sean L. Barton, Nicholas R. Waytowich, Derrik E. Asher

Abstract: We discuss the role of coordination as a direct learning objective in multi-agent reinforcement learning (MARL) domains. To this end, we present a novel means of quantifying coordination in multi-agent systems, and discuss the implications of using such a measure to optimize coordinated agent policies. This concept has important implications for adversary-aware RL, which we take to be a sub-domain of multi-agent learning.

Submitted 13 September, 2018; originally announced September 2018.

Comments: AAAI Fall Symposium 2018, Concept Paper

Report number: Vol-2269 FSS-18

Journal ref: Proceedings of the AAAI Fall 2018 Symposium on Adversary-Aware Learning Techniques and Trends in Cybersecurity, Arlington, VA, USA, 18-19 October, 2018, published at http://ceur-ws.org

arXiv:1807.08663 [pdf]

Measuring collaborative emergent behavior in multi-agent reinforcement learning

Authors: Sean L. Barton, Nicholas R. Waytowich, Erin Zaroukian, Derrik E. Asher

Abstract: Multi-agent reinforcement learning (RL) has important implications for the future of human-agent teaming. We show that improved performance with multi-agent RL is not a guarantee of the collaborative behavior thought to be important for solving multi-agent tasks. To address this, we present a novel approach for quantitatively assessing collaboration in continuous spatial tasks with multi-agent RL. Such a metric is useful for measuring collaboration between computational agents and may serve as a training signal for collaboration in future RL paradigms involving humans.

Submitted 23 July, 2018; originally announced July 2018.

Comments: 1st International Conference on Human Systems Engineering and Design, 6 pages, 2 figures, 1 table

arXiv:1807.05806 [pdf]

Adapting the Predator-Prey Game Theoretic Environment to Army Tactical Edge Scenarios with Computational Multiagent Systems

Authors: Derrik E. Asher, Erin Zaroukian, Sean L. Barton

Abstract: The historical origins of the game theoretic predator-prey pursuit problem can be traced back to Benda, et al., 1985 [1]. Their work adapted the predator-prey ecology problem into a pursuit environment which focused on the dynamics of cooperative behavior between predator agents. Modifications to the predator-prey ecology problem [2] have been implemented to understand how variations to predator [3] and prey [3-5] attributes, including communication [6], can modify dynamic interactions between entities that emerge within that environment [7-9]. Furthermore, the predator-prey pursuit environment has become a testbed for simulation experiments with computational multiagent systems [10-12]. This article extends the theoretical contributions of previous work by providing 1) additional variations to predator and prey attributes for simulated multiagent systems in the pursuit problem, and 2) military-relevant predator-prey environments simulating highly dynamic, tactical edge scenarios that Soldiers might encounter on future battlefields. Through this exploration of simulated tactical edge scenarios with computational multiagent systems, Soldiers will have a greater chance to achieve overmatch on the battlefields of tomorrow.

Submitted 16 July, 2018; originally announced July 2018.

Comments: Concept paper: Modifying the predator-prey pursuit environment to simulate tactical edge scenarios, 9 pages, 1 figure, International Command and Control Research and Technology Symposium (ICCRTS - 2018)

Report number: ARL-TR-8453

Journal ref: US Army Research Laboratory Aberdeen Proving Ground United States, 2018

Showing 1–5 of 5 results for author: Barton, S L