-
The dynamic interplay between in-context and in-weight learning in humans and neural networks
Authors:
Jacob Russin,
Ellie Pavlick,
Michael J. Frank
Abstract:
Human learning embodies a striking duality: sometimes, we appear capable of following logical, compositional rules and benefit from structured curricula (e.g., in formal education), while other times, we rely on an incremental approach or trial-and-error, learning better from curricula that are randomly interleaved. Influential psychological theories explain this seemingly disparate behavioral evidence by positing two qualitatively different learning systems -- one for rapid, rule-based inferences and another for slow, incremental adaptation. It remains unclear how to reconcile such theories with neural networks, which learn via incremental weight updates and are thus a natural model for the latter type of learning, but are not obviously compatible with the former. However, recent evidence suggests that metalearning neural networks and large language models are capable of "in-context learning" (ICL) -- the ability to flexibly grasp the structure of a new task from a few examples. Here, we show that the dynamic interplay between ICL and default in-weight learning (IWL) naturally captures a broad range of learning phenomena observed in humans, reproducing curriculum effects on category-learning and compositional tasks, and recapitulating a tradeoff between flexibility and retention. Our work shows how emergent ICL can equip neural networks with fundamentally different learning properties that can coexist with their native IWL, thus offering a novel perspective on dual-process theories and human cognitive flexibility.
Submitted 25 April, 2025; v1 submitted 13 February, 2024;
originally announced February 2024.
-
A Neural Network Model of Continual Learning with Cognitive Control
Authors:
Jacob Russin,
Maryam Zolfaghar,
Seongmin A. Park,
Erie Boorman,
Randall C. O'Reilly
Abstract:
Neural networks struggle in continual learning settings because of catastrophic forgetting: when trials are blocked, new learning can overwrite the learning from previous blocks. Humans learn effectively in these settings, in some cases even showing an advantage of blocking, suggesting the brain contains mechanisms to overcome this problem. Here, we build on previous work and show that neural networks equipped with a mechanism for cognitive control do not exhibit catastrophic forgetting when trials are blocked. We further show an advantage of blocking over interleaving when there is a bias for active maintenance in the control signal, implying a tradeoff between maintenance and the strength of control. Analyses of map-like representations learned by the networks provided additional insights into these mechanisms. Our work highlights the potential of cognitive control to aid continual learning in neural networks, and offers an explanation for the advantage of blocking that has been observed in humans.
Submitted 3 November, 2022; v1 submitted 9 February, 2022;
originally announced February 2022.
-
The Structure of Systematicity in the Brain
Authors:
Randall C. O'Reilly,
Charan Ranganath,
Jacob L. Russin
Abstract:
A hallmark of human intelligence is the ability to adapt to new situations, by applying learned rules to new content (systematicity) and thereby enabling an open-ended number of inferences and actions (generativity). Here, we propose that the human brain accomplishes these feats through pathways in the parietal cortex that encode the abstract structure of space, events, and tasks, and pathways in the temporal cortex that encode information about specific people, places, and things (content). Recent neural network models show how the separation of structure and content might emerge through a combination of architectural biases and learning, and these networks show dramatic improvements in the ability to capture systematic, generative behavior. We close by considering how the hippocampal formation may form integrative memories that enable rapid learning of new structure and content representations.
Submitted 7 August, 2021;
originally announced August 2021.
-
Complementary Structure-Learning Neural Networks for Relational Reasoning
Authors:
Jacob Russin,
Maryam Zolfaghar,
Seongmin A. Park,
Erie Boorman,
Randall C. O'Reilly
Abstract:
The neural mechanisms supporting flexible relational inferences, especially in novel situations, are a major focus of current research. In the complementary learning systems framework, pattern separation in the hippocampus allows rapid learning in novel environments, while slower learning in neocortex accumulates small weight changes to extract systematic structure from well-learned environments. In this work, we adapt this framework to a task from a recent fMRI experiment where novel transitive inferences must be made according to implicit relational structure. We show that computational models capturing the basic cognitive properties of these two systems can explain relational transitive inferences in both familiar and novel environments, and reproduce key phenomena observed in the fMRI experiment.
Submitted 19 May, 2021;
originally announced May 2021.
-
Deep Predictive Learning in Neocortex and Pulvinar
Authors:
Randall C. O'Reilly,
Jacob L. Russin,
Maryam Zolfaghar,
John Rohrlich
Abstract:
How do humans learn from raw sensory experience? Throughout life, but most obviously in infancy, we learn without explicit instruction. We propose a detailed biological mechanism for the widely embraced idea that learning is based on the differences between predictions and actual outcomes (i.e., predictive error-driven learning). Specifically, numerous weak projections into the pulvinar nucleus of the thalamus generate top-down predictions, and sparse, focal driver inputs from lower areas, originating in layer 5 intrinsic bursting (5IB) neurons, supply the actual outcome. Thus, the outcome is only briefly activated, roughly every 100 msec (i.e., 10 Hz, alpha), producing a temporal difference error signal that drives local synaptic changes throughout the neocortex, resulting in a biologically plausible form of error backpropagation learning. We implemented these mechanisms in a large-scale model of the visual system, and found that the simulated inferotemporal (IT) pathway learns to systematically categorize 3D objects according to invariant shape properties, based solely on predictive learning from raw visual inputs. These categories match human judgments on the same stimuli, and are consistent with neural representations in IT cortex in primates.
Submitted 28 January, 2021; v1 submitted 26 June, 2020;
originally announced June 2020.