-
A Neural Network Model of Continual Learning with Cognitive Control
Authors:
Jacob Russin,
Maryam Zolfaghar,
Seongmin A. Park,
Erie Boorman,
Randall C. O'Reilly
Abstract:
Neural networks struggle in continual learning settings from catastrophic forgetting: when trials are blocked, new learning can overwrite the learning from previous blocks. Humans learn effectively in these settings, in some cases even showing an advantage of blocking, suggesting the brain contains mechanisms to overcome this problem. Here, we build on previous work and show that neural networks equipped with a mechanism for cognitive control do not exhibit catastrophic forgetting when trials are blocked. We further show an advantage of blocking over interleaving when there is a bias for active maintenance in the control signal, implying a tradeoff between maintenance and the strength of control. Analyses of map-like representations learned by the networks provided additional insights into these mechanisms. Our work highlights the potential of cognitive control to aid continual learning in neural networks, and offers an explanation for the advantage of blocking that has been observed in humans.
Submitted 3 November, 2022; v1 submitted 9 February, 2022;
originally announced February 2022.
-
Locally Learned Synaptic Dropout for Complete Bayesian Inference
Authors:
Kevin L. McKee,
Ian C. Crandell,
Rishidev Chaudhuri,
Randall C. O'Reilly
Abstract:
The Bayesian brain hypothesis postulates that the brain accurately operates on statistical distributions according to Bayes' theorem. The random failure of presynaptic vesicles to release neurotransmitters may allow the brain to sample from posterior distributions of network parameters, interpreted as epistemic uncertainty. It has not been shown previously how random failures might allow networks to sample from observed distributions, also known as aleatoric or residual uncertainty. Sampling from both distributions enables probabilistic inference, efficient search, and creative or generative problem solving. We demonstrate that under a population-code based interpretation of neural activity, both types of distribution can be represented and sampled with synaptic failure alone. We first define a biologically constrained neural network and sampling scheme based on synaptic failure and lateral inhibition. Within this framework, we derive drop-out based epistemic uncertainty, then prove an analytic mapping from synaptic efficacy to release probability that allows networks to sample from arbitrary, learned distributions represented by a receiving layer. Second, our result leads to a local learning rule by which synapses adapt their release probabilities. Our result demonstrates complete Bayesian inference, related to the variational learning method of dropout, in a biologically constrained network using only locally-learned synaptic failure rates.
Submitted 29 November, 2021; v1 submitted 18 November, 2021;
originally announced November 2021.
-
Statistical Learning in Speech: A Biologically Based Predictive Learning Model
Authors:
John Rohrlich,
Randall C. O'Reilly
Abstract:
Infants, adults, non-human primates and non-primates all learn patterns implicitly, and they do so across modalities. The biological evidence supports the hypothesis that the mechanism for this learning is general but computationally local. We hypothesize that the mechanism itself is predictive error-driven learning. We build on recent work that advanced a biologically plausible model of error backpropagation learning, which proposes that higher order thalamic nuclei provide a locale for a temporal difference between top-down predictions and an actual event outcome. Our neural network based on that work also models the auditory cortex hierarchy of core, belt and parabelt and the caudal-rostral axis within regions. We simulated two studies showing statistical learning in infants: a seminal study using synthesized speech and a more recent study using human speech. Before simulating these studies, the network was trained on spoken sentences from the TIMIT corpus to emulate infants' experience listening to random speech. The implemented neural network, learning only by predicting the next brief speech segment, learned in both simulations to predict in-word syllables better than next-word syllables, showing that prediction could be the basis for word segmentation and thus statistical learning.
Submitted 13 August, 2021;
originally announced August 2021.
-
The Structure of Systematicity in the Brain
Authors:
Randall C. O'Reilly,
Charan Ranganath,
Jacob L. Russin
Abstract:
A hallmark of human intelligence is the ability to adapt to new situations, by applying learned rules to new content (systematicity) and thereby enabling an open-ended number of inferences and actions (generativity). Here, we propose that the human brain accomplishes these feats through pathways in the parietal cortex that encode the abstract structure of space, events, and tasks, and pathways in the temporal cortex that encode information about specific people, places, and things (content). Recent neural network models show how the separation of structure and content might emerge through a combination of architectural biases and learning, and these networks show dramatic improvements in the ability to capture systematic, generative behavior. We close by considering how the hippocampal formation may form integrative memories that enable rapid learning of new structure and content representations.
Submitted 7 August, 2021;
originally announced August 2021.
-
Complementary Structure-Learning Neural Networks for Relational Reasoning
Authors:
Jacob Russin,
Maryam Zolfaghar,
Seongmin A. Park,
Erie Boorman,
Randall C. O'Reilly
Abstract:
The neural mechanisms supporting flexible relational inferences, especially in novel situations, are a major focus of current research. In the complementary learning systems framework, pattern separation in the hippocampus allows rapid learning in novel environments, while slower learning in neocortex accumulates small weight changes to extract systematic structure from well-learned environments. In this work, we adapt this framework to a task from a recent fMRI experiment where novel transitive inferences must be made according to implicit relational structure. We show that computational models capturing the basic cognitive properties of these two systems can explain relational transitive inferences in both familiar and novel environments, and reproduce key phenomena observed in the fMRI experiment.
Submitted 19 May, 2021;
originally announced May 2021.
-
Deep Predictive Learning in Neocortex and Pulvinar
Authors:
Randall C. O'Reilly,
Jacob L. Russin,
Maryam Zolfaghar,
John Rohrlich
Abstract:
How do humans learn from raw sensory experience? Throughout life, but most obviously in infancy, we learn without explicit instruction. We propose a detailed biological mechanism for the widely-embraced idea that learning is based on the differences between predictions and actual outcomes (i.e., predictive error-driven learning). Specifically, numerous weak projections into the pulvinar nucleus of the thalamus generate top-down predictions, and sparse, focal driver inputs from lower areas supply the actual outcome, originating in layer 5 intrinsic bursting (5IB) neurons. Thus, the outcome is only briefly activated, roughly every 100 msec (i.e., 10 Hz, alpha), resulting in a temporal difference error signal, which drives local synaptic changes throughout the neocortex, resulting in a biologically-plausible form of error backpropagation learning. We implemented these mechanisms in a large-scale model of the visual system, and found that the simulated inferotemporal (IT) pathway learns to systematically categorize 3D objects according to invariant shape properties, based solely on predictive learning from raw visual inputs. These categories match human judgments on the same stimuli, and are consistent with neural representations in IT cortex in primates.
Submitted 28 January, 2021; v1 submitted 26 June, 2020;
originally announced June 2020.
-
Deep Predictive Learning: A Comprehensive Model of Three Visual Streams
Authors:
Randall C. O'Reilly,
Dean R. Wyatte,
John Rohrlich
Abstract:
How does the neocortex learn and develop the foundations of all our high-level cognitive abilities? We present a comprehensive framework spanning biological, computational, and cognitive levels, with a clear theoretical continuity between levels, providing a coherent answer directly supported by extensive data at each level. Learning is based on making predictions about what the senses will report at 100 msec (alpha frequency) intervals, and adapting synaptic weights to improve prediction accuracy. The pulvinar nucleus of the thalamus serves as a projection screen upon which predictions are generated, through deep-layer 6 corticothalamic inputs from multiple brain areas and levels of abstraction. The sparse driving inputs from layer 5 intrinsic bursting neurons provide the target signal, and the temporal difference between it and the prediction reverberates throughout the cortex, driving synaptic changes that approximate error backpropagation, using only local activation signals in equations derived directly from a detailed biophysical model. In vision, predictive learning requires a carefully-organized developmental progression and anatomical organization of three pathways (What, Where, and What * Where), according to two central principles: top-down input from compact, high-level, abstract representations is essential for accurate prediction of low-level sensory inputs; and the collective, low-level prediction error must be progressively and opportunistically partitioned to enable extraction of separable factors that drive the learning of further high-level abstractions. Our model self-organized systematic invariant object representations of 100 different objects from simple movies, accounts for a wide range of data, and makes many testable predictions.
Submitted 14 September, 2017;
originally announced September 2017.
-
The cerebellum could solve the motor error problem through error increase prediction
Authors:
Sergio Verduzco-Flores,
Randall C. O'Reilly
Abstract:
We present a cerebellar architecture with two main characteristics. The first one is that complex spikes respond to increases in sensory errors. The second one is that cerebellar modules associate particular contexts where errors have increased in the past with corrective commands that stop the increase in error. We analyze our architecture formally and computationally for the case of reaching in a 3D environment. In the case of motor control, we show that there are synergies of this architecture with the Equilibrium-Point hypothesis, leading to novel ways to solve the motor error problem. In particular, the presence of desired equilibrium lengths for muscles provides a way to know when the error is increasing, and which corrections to apply. In the context of Threshold Control Theory and Perceptual Control Theory we show how to extend our model so it implements anticipative corrections in cascade control systems that span from muscle contractions to cognitive operations.
Submitted 21 February, 2015; v1 submitted 14 August, 2014;
originally announced August 2014.
-
Learning Through Time in the Thalamocortical Loops
Authors:
Randall C. O'Reilly,
Dean Wyatte,
John Rohrlich
Abstract:
We present a comprehensive, novel framework for understanding how the neocortex, including the thalamocortical loops through the deep layers, can support a temporal context representation in the service of predictive learning. Many have argued that predictive learning provides a compelling, powerful source of learning signals to drive the development of human intelligence: if we constantly predict what will happen next, and learn based on the discrepancies from our predictions (error-driven learning), then we can learn to improve our predictions by developing internal representations that capture the regularities of the environment (e.g., physical laws governing the time-evolution of object motions). Our version of this idea builds upon existing work with simple recurrent networks (SRNs), which have discretely updated temporal context representations that are a direct copy of the prior internal state representation. We argue that this discretization of temporal context updating has a number of important computational and functional advantages, and further show how the strong alpha-frequency (10 Hz, 100 ms cycle time) oscillations in the posterior neocortex could reflect this temporal context updating. We examine a wide range of data from biology to behavior through the lens of this LeabraTI model, and find that it provides a unified account of a number of otherwise disconnected findings, all of which converge to support this new model of neocortical learning and processing. We describe an implemented model showing how predictive learning of tumbling object trajectories can facilitate object recognition with cluttered backgrounds.
Submitted 13 July, 2014;
originally announced July 2014.
-
Goal-Driven Cognition in the Brain: A Computational Framework
Authors:
Randall C. O'Reilly,
Thomas E. Hazy,
Jessica Mollick,
Prescott Mackie,
Seth Herd
Abstract:
Current theoretical and computational models of dopamine-based reinforcement learning are largely rooted in the classical behaviorist tradition, and envision the organism as a purely reactive recipient of rewards and punishments, with resulting behavior that essentially reflects the sum of this reinforcement history. This framework is missing some fundamental features of the affective nervous system, most importantly, the central role of goals in driving and organizing behavior in a teleological manner. Even when goal-directed behaviors are considered in current frameworks, they are typically conceived of as arising in reaction to the environment, rather than being in place from the start. We hypothesize that goal-driven cognition is primary, and organized into two discrete phases: goal selection and goal engaged, which each have a substantially different effective value function. This dichotomy can potentially explain a wide range of phenomena, playing a central role in many clinical disorders, such as depression, OCD, ADHD, and PTSD, and providing a sensible account of the detailed biology and function of the dopamine system and larger limbic system, including critical ventral and medial prefrontal cortex. Computationally, reasoning backward from active goals to action selection is more tractable than projecting alternative action choices forward to compute possible outcomes. An explicit computational model of these brain areas and their function in this goal-driven framework is described, as are numerous testable predictions from this framework.
Submitted 30 April, 2014;
originally announced April 2014.