-
Semantic Exploration from Language Abstractions and Pretrained Representations
Authors:
Allison C. Tam,
Neil C. Rabinowitz,
Andrew K. Lampinen,
Nicholas A. Roy,
Stephanie C. Y. Chan,
DJ Strouse,
Jane X. Wang,
Andrea Banino,
Felix Hill
Abstract:
Effective exploration is a challenge in reinforcement learning (RL). Novelty-based exploration methods can suffer in high-dimensional state spaces, such as continuous partially-observable 3D environments. We address this challenge by defining novelty using semantically meaningful state abstractions, which can be found in learned representations shaped by natural language. In particular, we evaluat…
▽ More
Effective exploration is a challenge in reinforcement learning (RL). Novelty-based exploration methods can suffer in high-dimensional state spaces, such as continuous partially-observable 3D environments. We address this challenge by defining novelty using semantically meaningful state abstractions, which can be found in learned representations shaped by natural language. In particular, we evaluate vision-language representations, pretrained on natural image captioning datasets. We show that these pretrained representations drive meaningful, task-relevant exploration and improve performance on 3D simulated environments. We also characterize why and how language provides useful abstractions for exploration by considering the impacts of using representations from a pretrained model, a language oracle, and several ablations. We demonstrate the benefits of our approach in two very different task domains -- one that stresses the identification and manipulation of everyday objects, and one that requires navigational exploration in an expansive world. Our results suggest that using language-shaped representations could improve exploration for various algorithms and agents in challenging environments.
△ Less
Submitted 26 April, 2023; v1 submitted 8 April, 2022;
originally announced April 2022.
-
Tell me why! Explanations support learning relational and causal structure
Authors:
Andrew K. Lampinen,
Nicholas A. Roy,
Ishita Dasgupta,
Stephanie C. Y. Chan,
Allison C. Tam,
James L. McClelland,
Chen Yan,
Adam Santoro,
Neil C. Rabinowitz,
Jane X. Wang,
Felix Hill
Abstract:
Inferring the abstract relational and causal structure of the world is a major challenge for reinforcement-learning (RL) agents. For humans, language--particularly in the form of explanations--plays a considerable role in overcoming this challenge. Here, we show that language can play a similar role for deep RL agents in complex environments. While agents typically struggle to acquire relational a…
▽ More
Inferring the abstract relational and causal structure of the world is a major challenge for reinforcement-learning (RL) agents. For humans, language--particularly in the form of explanations--plays a considerable role in overcoming this challenge. Here, we show that language can play a similar role for deep RL agents in complex environments. While agents typically struggle to acquire relational and causal knowledge, augmenting their experience by training them to predict language descriptions and explanations can overcome these limitations. We show that language can help agents learn challenging relational tasks, and examine which aspects of language contribute to its benefits. We then show that explanations can help agents to infer not only relational but also causal structure. Language can shape the way that agents to generalize out-of-distribution from ambiguous, causally-confounded training, and explanations even allow agents to learn to perform experimental interventions to identify causal relationships. Our results suggest that language description and explanation may be powerful tools for improving agent learning and generalization.
△ Less
Submitted 25 May, 2022; v1 submitted 7 December, 2021;
originally announced December 2021.
-
Meta-learners' learning dynamics are unlike learners'
Authors:
Neil C. Rabinowitz
Abstract:
Meta-learning is a tool that allows us to build sample-efficient learning systems. Here we show that, once meta-trained, LSTM Meta-Learners aren't just faster learners than their sample-inefficient deep learning (DL) and reinforcement learning (RL) brethren, but that they actually pursue fundamentally different learning trajectories. We study their learning dynamics on three sets of structured tas…
▽ More
Meta-learning is a tool that allows us to build sample-efficient learning systems. Here we show that, once meta-trained, LSTM Meta-Learners aren't just faster learners than their sample-inefficient deep learning (DL) and reinforcement learning (RL) brethren, but that they actually pursue fundamentally different learning trajectories. We study their learning dynamics on three sets of structured tasks for which the corresponding learning dynamics of DL and RL systems have been previously described: linear regression (Saxe et al., 2013), nonlinear regression (Rahaman et al., 2018; Xu et al., 2018), and contextual bandits (Schaul et al., 2019). In each case, while sample-inefficient DL and RL Learners uncover the task structure in a staggered manner, meta-trained LSTM Meta-Learners uncover almost all task structure concurrently, congruent with the patterns expected from Bayes-optimal inference algorithms. This has implications for research areas wherever the learning behaviour itself is of interest, such as safety, curriculum design, and human-in-the-loop machine learning.
△ Less
Submitted 3 May, 2019;
originally announced May 2019.
-
Relational Forward Models for Multi-Agent Learning
Authors:
Andrea Tacchetti,
H. Francis Song,
Pedro A. M. Mediano,
Vinicius Zambaldi,
Neil C. Rabinowitz,
Thore Graepel,
Matthew Botvinick,
Peter W. Battaglia
Abstract:
The behavioral dynamics of multi-agent systems have a rich and orderly structure, which can be leveraged to understand these systems, and to improve how artificial agents learn to operate in them. Here we introduce Relational Forward Models (RFM) for multi-agent learning, networks that can learn to make accurate predictions of agents' future behavior in multi-agent environments. Because these mode…
▽ More
The behavioral dynamics of multi-agent systems have a rich and orderly structure, which can be leveraged to understand these systems, and to improve how artificial agents learn to operate in them. Here we introduce Relational Forward Models (RFM) for multi-agent learning, networks that can learn to make accurate predictions of agents' future behavior in multi-agent environments. Because these models operate on the discrete entities and relations present in the environment, they produce interpretable intermediate representations which offer insights into what drives agents' behavior, and what events mediate the intensity and valence of social interactions. Furthermore, we show that embedding RFM modules inside agents results in faster learning systems compared to non-augmented baselines. As more and more of the autonomous systems we develop and interact with become multi-agent in nature, developing richer analysis tools for characterizing how and why agents make decisions is increasingly necessary. Moreover, developing artificial agents that quickly and safely learn to coordinate with one another, and with humans in shared environments, is crucial.
△ Less
Submitted 28 September, 2018;
originally announced September 2018.
-
Human-level performance in first-person multiplayer games with population-based deep reinforcement learning
Authors:
Max Jaderberg,
Wojciech M. Czarnecki,
Iain Dunning,
Luke Marris,
Guy Lever,
Antonio Garcia Castaneda,
Charles Beattie,
Neil C. Rabinowitz,
Ari S. Morcos,
Avraham Ruderman,
Nicolas Sonnerat,
Tim Green,
Louise Deason,
Joel Z. Leibo,
David Silver,
Demis Hassabis,
Koray Kavukcuoglu,
Thore Graepel
Abstract:
Recent progress in artificial intelligence through reinforcement learning (RL) has shown great success on increasingly complex single-agent environments and two-player turn-based games. However, the real-world contains multiple agents, each learning and acting independently to cooperate and compete with other agents, and environments reflecting this degree of complexity remain an open challenge. I…
▽ More
Recent progress in artificial intelligence through reinforcement learning (RL) has shown great success on increasingly complex single-agent environments and two-player turn-based games. However, the real-world contains multiple agents, each learning and acting independently to cooperate and compete with other agents, and environments reflecting this degree of complexity remain an open challenge. In this work, we demonstrate for the first time that an agent can achieve human-level in a popular 3D multiplayer first-person video game, Quake III Arena Capture the Flag, using only pixels and game points as input. These results were achieved by a novel two-tier optimisation process in which a population of independent RL agents are trained concurrently from thousands of parallel matches with agents playing in teams together and against each other on randomly generated environments. Each agent in the population learns its own internal reward signal to complement the sparse delayed reward from winning, and selects actions using a novel temporally hierarchical representation that enables the agent to reason at multiple timescales. During game-play, these agents display human-like behaviours such as navigating, following, and defending based on a rich learned representation that is shown to encode high-level game knowledge. In an extensive tournament-style evaluation the trained agents exceeded the win-rate of strong human players both as teammates and opponents, and proved far stronger than existing state-of-the-art agents. These results demonstrate a significant jump in the capabilities of artificial agents, bringing us closer to the goal of human-level intelligence.
△ Less
Submitted 3 July, 2018;
originally announced July 2018.
-
Pooling is neither necessary nor sufficient for appropriate deformation stability in CNNs
Authors:
Avraham Ruderman,
Neil C. Rabinowitz,
Ari S. Morcos,
Daniel Zoran
Abstract:
Many of our core assumptions about how neural networks operate remain empirically untested. One common assumption is that convolutional neural networks need to be stable to small translations and deformations to solve image recognition tasks. For many years, this stability was baked into CNN architectures by incorporating interleaved pooling layers. Recently, however, interleaved pooling has large…
▽ More
Many of our core assumptions about how neural networks operate remain empirically untested. One common assumption is that convolutional neural networks need to be stable to small translations and deformations to solve image recognition tasks. For many years, this stability was baked into CNN architectures by incorporating interleaved pooling layers. Recently, however, interleaved pooling has largely been abandoned. This raises a number of questions: Are our intuitions about deformation stability right at all? Is it important? Is pooling necessary for deformation invariance? If not, how is deformation invariance achieved in its absence? In this work, we rigorously test these questions, and find that deformation stability in convolutional networks is more nuanced than it first appears: (1) Deformation invariance is not a binary property, but rather that different tasks require different degrees of deformation stability at different layers. (2) Deformation stability is not a fixed property of a network and is heavily adjusted over the course of training, largely through the smoothness of the convolutional filters. (3) Interleaved pooling layers are neither necessary nor sufficient for achieving the optimal form of deformation stability for natural image classification. (4) Pooling confers too much deformation stability for image classification at initialization, and during training, networks have to learn to counteract this inductive bias. Together, these findings provide new insights into the role of interleaved pooling and deformation invariance in CNNs, and demonstrate the importance of rigorous empirical testing of even our most basic assumptions about the working of neural networks.
△ Less
Submitted 25 May, 2018; v1 submitted 12 April, 2018;
originally announced April 2018.
-
On the importance of single directions for generalization
Authors:
Ari S. Morcos,
David G. T. Barrett,
Neil C. Rabinowitz,
Matthew Botvinick
Abstract:
Despite their ability to memorize large datasets, deep neural networks often achieve good generalization performance. However, the differences between the learned solutions of networks which generalize and those which do not remain unclear. Additionally, the tuning properties of single directions (defined as the activation of a single unit or some linear combination of units in response to some in…
▽ More
Despite their ability to memorize large datasets, deep neural networks often achieve good generalization performance. However, the differences between the learned solutions of networks which generalize and those which do not remain unclear. Additionally, the tuning properties of single directions (defined as the activation of a single unit or some linear combination of units in response to some input) have been highlighted, but their importance has not been evaluated. Here, we connect these lines of inquiry to demonstrate that a network's reliance on single directions is a good predictor of its generalization performance, across networks trained on datasets with different fractions of corrupted labels, across ensembles of networks trained on datasets with unmodified labels, across different hyperparameters, and over the course of training. While dropout only regularizes this quantity up to a point, batch normalization implicitly discourages single direction reliance, in part by decreasing the class selectivity of individual units. Finally, we find that class selectivity is a poor predictor of task importance, suggesting not only that networks which generalize well minimize their dependence on individual units by reducing their selectivity, but also that individually selective units may not be necessary for strong network performance.
△ Less
Submitted 22 May, 2018; v1 submitted 19 March, 2018;
originally announced March 2018.
-
Machine Theory of Mind
Authors:
Neil C. Rabinowitz,
Frank Perbet,
H. Francis Song,
Chiyuan Zhang,
S. M. Ali Eslami,
Matthew Botvinick
Abstract:
Theory of mind (ToM; Premack & Woodruff, 1978) broadly refers to humans' ability to represent the mental states of others, including their desires, beliefs, and intentions. We propose to train a machine to build such models too. We design a Theory of Mind neural network -- a ToMnet -- which uses meta-learning to build models of the agents it encounters, from observations of their behaviour alone.…
▽ More
Theory of mind (ToM; Premack & Woodruff, 1978) broadly refers to humans' ability to represent the mental states of others, including their desires, beliefs, and intentions. We propose to train a machine to build such models too. We design a Theory of Mind neural network -- a ToMnet -- which uses meta-learning to build models of the agents it encounters, from observations of their behaviour alone. Through this process, it acquires a strong prior model for agents' behaviour, as well as the ability to bootstrap to richer predictions about agents' characteristics and mental states using only a small number of behavioural observations. We apply the ToMnet to agents behaving in simple gridworld environments, showing that it learns to model random, algorithmic, and deep reinforcement learning agents from varied populations, and that it passes classic ToM tasks such as the "Sally-Anne" test (Wimmer & Perner, 1983; Baron-Cohen et al., 1985) of recognising that others can hold false beliefs about the world. We argue that this system -- which autonomously learns how to model other agents in its world -- is an important step forward for developing multi-agent AI systems, for building intermediating technology for machine-human interaction, and for advancing the progress on interpretable AI.
△ Less
Submitted 12 March, 2018; v1 submitted 21 February, 2018;
originally announced February 2018.
-
Building Machines that Learn and Think for Themselves: Commentary on Lake et al., Behavioral and Brain Sciences, 2017
Authors:
M. Botvinick,
D. G. T. Barrett,
P. Battaglia,
N. de Freitas,
D. Kumaran,
J. Z Leibo,
T. Lillicrap,
J. Modayil,
S. Mohamed,
N. C. Rabinowitz,
D. J. Rezende,
A. Santoro,
T. Schaul,
C. Summerfield,
G. Wayne,
T. Weber,
D. Wierstra,
S. Legg,
D. Hassabis
Abstract:
We agree with Lake and colleagues on their list of key ingredients for building humanlike intelligence, including the idea that model-based reasoning is essential. However, we favor an approach that centers on one additional ingredient: autonomy. In particular, we aim toward agents that can both build and exploit their own internal models, with minimal human hand-engineering. We believe an approac…
▽ More
We agree with Lake and colleagues on their list of key ingredients for building humanlike intelligence, including the idea that model-based reasoning is essential. However, we favor an approach that centers on one additional ingredient: autonomy. In particular, we aim toward agents that can both build and exploit their own internal models, with minimal human hand-engineering. We believe an approach centered on autonomous learning has the greatest chance of success as we scale toward real-world complexity, tackling domains for which ready-made formal models are not available. Here we survey several important examples of the progress that has been made toward building autonomous agents with humanlike abilities, and highlight some outstanding challenges.
△ Less
Submitted 22 November, 2017;
originally announced November 2017.
-
Progressive Neural Networks
Authors:
Andrei A. Rusu,
Neil C. Rabinowitz,
Guillaume Desjardins,
Hubert Soyer,
James Kirkpatrick,
Koray Kavukcuoglu,
Razvan Pascanu,
Raia Hadsell
Abstract:
Learning to solve complex sequences of tasks--while both leveraging transfer and avoiding catastrophic forgetting--remains a key obstacle to achieving human-level intelligence. The progressive networks approach represents a step forward in this direction: they are immune to forgetting and can leverage prior knowledge via lateral connections to previously learned features. We evaluate this architec…
▽ More
Learning to solve complex sequences of tasks--while both leveraging transfer and avoiding catastrophic forgetting--remains a key obstacle to achieving human-level intelligence. The progressive networks approach represents a step forward in this direction: they are immune to forgetting and can leverage prior knowledge via lateral connections to previously learned features. We evaluate this architecture extensively on a wide variety of reinforcement learning tasks (Atari and 3D maze games), and show that it outperforms common baselines based on pretraining and finetuning. Using a novel sensitivity measure, we demonstrate that transfer occurs at both low-level sensory and high-level control layers of the learned policy.
△ Less
Submitted 22 October, 2022; v1 submitted 15 June, 2016;
originally announced June 2016.
-
The local low-dimensionality of natural images
Authors:
Olivier J. Hénaff,
Johannes Ballé,
Neil C. Rabinowitz,
Eero P. Simoncelli
Abstract:
We develop a new statistical model for photographic images, in which the local responses of a bank of linear filters are described as jointly Gaussian, with zero mean and a covariance that varies slowly over spatial position. We optimize sets of filters so as to minimize the nuclear norms of matrices of their local activations (i.e., the sum of the singular values), thus encouraging a flexible for…
▽ More
We develop a new statistical model for photographic images, in which the local responses of a bank of linear filters are described as jointly Gaussian, with zero mean and a covariance that varies slowly over spatial position. We optimize sets of filters so as to minimize the nuclear norms of matrices of their local activations (i.e., the sum of the singular values), thus encouraging a flexible form of sparsity that is not tied to any particular dictionary or coordinate system. Filters optimized according to this objective are oriented and bandpass, and their responses exhibit substantial local correlation. We show that images can be reconstructed nearly perfectly from estimates of the local filter response covariances alone, and with minimal degradation (either visual or MSE) from low-rank approximations of these covariances. As such, this representation holds much promise for use in applications such as denoising, compression, and texture representation, and may form a useful substrate for hierarchical decompositions.
△ Less
Submitted 23 March, 2015; v1 submitted 20 December, 2014;
originally announced December 2014.