-
Transient dynamics of associative memory models
Authors:
David G. Clark
Abstract:
Associative memory models such as the Hopfield network and its dense generalizations with higher-order interactions exhibit a "blackout catastrophe"--a discontinuous transition where stable memory states abruptly vanish when the number of stored patterns exceeds a critical capacity. This transition is often interpreted as rendering networks unusable beyond capacity limits. We argue that this interpretation is largely an artifact of the equilibrium perspective. We derive dynamical mean-field equations using a bipartite cavity approach for graded-activity dense associative memory models, with the Hopfield model as a special case, and solve them using a numerical scheme. We show that patterns can be transiently retrieved with high accuracy above capacity despite the absence of stable attractors. This occurs because slow regions persist in the above-capacity energy landscape as shallow, unstable remnants of below-capacity stable basins. The same transient-retrieval effect occurs in below-capacity networks initialized outside basins of attraction. "Transient-recovery curves" provide a concise visual summary of these effects, revealing graceful, non-catastrophic changes in retrieval behavior above capacity and allowing us to compare the behavior across interaction orders. This dynamical perspective reveals rich energy landscape structure obscured by equilibrium analysis and suggests biological neural circuits may exploit transient dynamics for memory retrieval. Furthermore, our approach suggests ways of understanding computational properties of neural circuits without reference to fixed points, advances the technical repertoire of numerical mean-field solution methods for recurrent neural networks, and yields new theoretical results on generalizations of the Hopfield model.
Submitted 5 June, 2025;
originally announced June 2025.
-
Two for the Price of One: Integrating Large Language Models to Learn Biophysical Interactions
Authors:
Joseph D. Clark,
Tanner J. Dean,
Diwakar Shukla
Abstract:
Deep learning models have become fundamental tools in drug design. In particular, large language models trained on biochemical sequences learn feature vectors that guide drug discovery through virtual screening. However, such models do not capture the molecular interactions important for binding affinity and specificity. Therefore, there is a need to 'compose' representations from distinct biological modalities to effectively represent molecular complexes. We present an overview of the methods to combine molecular representations and propose that future work should balance computational efficiency and expressiveness. Specifically, we argue that improvements in both speed and accuracy are possible by learning to merge the representations from internal layers of domain-specific biological language models. We demonstrate that 'composing' biochemical language models performs similarly to or better than standard methods for representing molecular interactions despite having significantly fewer features. Finally, we discuss recent methods for interpreting and democratizing large language models that could aid the development of interaction-aware foundation models for biology, as well as their shortcomings.
Submitted 26 March, 2025;
originally announced March 2025.
-
Simplified derivations for high-dimensional convex learning problems
Authors:
David G. Clark,
Haim Sompolinsky
Abstract:
Statistical-physics calculations in machine learning and theoretical neuroscience often involve lengthy derivations that obscure physical interpretation. We present concise, non-replica derivations of key results and highlight their underlying similarities. Using a cavity approach, we analyze high-dimensional learning problems: perceptron classification of points and manifolds, and kernel ridge regression. These problems share a common structure--a bipartite system of interacting feature and datum variables--enabling a unified analysis. For perceptron-capacity problems, we identify a symmetry that allows derivation of correct capacities through a naïve method.
Submitted 10 February, 2025; v1 submitted 1 December, 2024;
originally announced December 2024.
-
Connectivity structure and dynamics of nonlinear recurrent neural networks
Authors:
David G. Clark,
Owen Marschall,
Alexander van Meegen,
Ashok Litwin-Kumar
Abstract:
We develop a theory to analyze how structure in connectivity shapes the high-dimensional, internally generated activity of nonlinear recurrent neural networks. Using two complementary methods -- a path-integral calculation of fluctuations around the saddle point, and a recently introduced two-site cavity approach -- we derive analytic expressions that characterize important features of collective activity, including its dimensionality and temporal correlations. To model structure in the coupling matrices of real neural circuits, such as synaptic connectomes obtained through electron microscopy, we introduce the random-mode model, which parameterizes a coupling matrix using random input and output modes and a specified spectrum. This model enables systematic study of the effects of low-dimensional structure in connectivity on neural activity. These effects manifest in features of collective activity, which we calculate, and can be undetectable when analyzing only single-neuron activities. We derive a relation between the effective rank of the coupling matrix and the dimension of activity. By extending the random-mode model, we compare the effects of single-neuron heterogeneity and low-dimensional connectivity. We also investigate the impact of structured overlaps between input and output modes, a feature of biological coupling matrices. Our theory provides tools to relate neural-network architecture and collective dynamics in artificial and biological systems.
Submitted 3 September, 2024;
originally announced September 2024.
-
Substrate Prediction for RiPP Biosynthetic Enzymes via Masked Language Modeling and Transfer Learning
Authors:
Joseph D. Clark,
Xuenan Mi,
Douglas A. Mitchell,
Diwakar Shukla
Abstract:
Ribosomally synthesized and post-translationally modified peptide (RiPP) biosynthetic enzymes often exhibit promiscuous substrate preferences that cannot be reduced to simple rules. Large language models are promising tools for predicting such peptide fitness landscapes. However, state-of-the-art protein language models are trained on relatively few peptide sequences. A previous study comprehensively profiled the peptide substrate preferences of LazBF (a two-component serine dehydratase) and LazDEF (a three-component azole synthetase) from the lactazole biosynthetic pathway. We demonstrated that masked language modeling of LazBF substrate preferences produced language model embeddings that improved downstream classification models of both LazBF and LazDEF substrates. Similarly, masked language modeling of LazDEF substrate preferences produced embeddings that improved the performance of classification models of both LazBF and LazDEF substrates. Our results suggest that the models learned functional forms that are transferable between distinct enzymatic transformations that act within the same biosynthetic pathway. Our transfer learning method improved performance and data efficiency in data-scarce scenarios. We then fine-tuned models on each data set and showed that the fine-tuned models provided interpretable insight that we anticipate will facilitate the design of substrate libraries that are compatible with desired RiPP biosynthetic pathways.
Submitted 23 February, 2024;
originally announced February 2024.
-
Structure of activity in multiregion recurrent neural networks
Authors:
David G. Clark,
Manuel Beiran
Abstract:
Neural circuits comprise multiple interconnected regions, each with complex dynamics. The interplay between local and global activity is thought to underlie computational flexibility, yet the structure of multiregion neural activity and its origins in synaptic connectivity remain poorly understood. We investigate recurrent neural networks with multiple regions, each containing neurons with random and structured connections. Inspired by experimental evidence of communication subspaces, we use low-rank connectivity between regions to enable selective activity routing. These networks exhibit high-dimensional fluctuations within regions and low-dimensional signal transmission between them. Using dynamical mean-field theory, with cross-region currents as order parameters, we show that regions act as both generators and transmitters of activity -- roles that are often in tension. Taming within-region activity can be crucial for effective signal routing. Unlike previous models that suppressed neural activity to control signal flow, our model achieves routing by exciting different high-dimensional activity patterns through connectivity structure and nonlinear dynamics. Our analysis offers insights into multiregion neural data and trained neural networks.
Submitted 8 January, 2025; v1 submitted 19 February, 2024;
originally announced February 2024.
-
Theory of coupled neuronal-synaptic dynamics
Authors:
David G. Clark,
L. F. Abbott
Abstract:
In neural circuits, synaptic strengths influence neuronal activity by shaping network dynamics, and neuronal activity influences synaptic strengths through activity-dependent plasticity. Motivated by this fact, we study a recurrent-network model in which neuronal units and synaptic couplings are interacting dynamic variables, with couplings subject to Hebbian modification with decay around quenched random strengths. Rather than assigning a specific role to the plasticity, we use dynamical mean-field theory and other techniques to systematically characterize the neuronal-synaptic dynamics, revealing a rich phase diagram. Adding Hebbian plasticity slows activity in chaotic networks and can induce chaos in otherwise quiescent networks. Anti-Hebbian plasticity quickens activity and produces an oscillatory component. Analysis of the Jacobian shows that Hebbian and anti-Hebbian plasticity push locally unstable modes toward the real and imaginary axes, explaining these behaviors. Both random-matrix and Lyapunov analysis show that strong Hebbian plasticity segregates network timescales into two bands with a slow, synapse-dominated band driving the dynamics, suggesting a flipped view of the network as synapses connected by neurons. For increasing strength, Hebbian plasticity initially raises the complexity of the dynamics, measured by the maximum Lyapunov exponent and attractor dimension, but then decreases these metrics, likely due to the proliferation of stable fixed points. We compute the marginally stable spectra of such fixed points as well as their number, showing exponential growth with network size. In chaotic states with strong Hebbian plasticity, a stable fixed point of neuronal dynamics is destabilized by synaptic dynamics, allowing any neuronal state to be stored as a stable fixed point by halting the plasticity. This phase of freezable chaos offers a new mechanism for working memory.
Submitted 10 January, 2024; v1 submitted 17 February, 2023;
originally announced February 2023.
-
Dimension of activity in random neural networks
Authors:
David G. Clark,
L. F. Abbott,
Ashok Litwin-Kumar
Abstract:
Neural networks are high-dimensional nonlinear dynamical systems that process information through the coordinated activity of many connected units. Understanding how biological and machine-learning networks function and learn requires knowledge of the structure of this coordinated activity, information contained, for example, in cross covariances between units. Self-consistent dynamical mean field theory (DMFT) has elucidated several features of random neural networks -- in particular, that they can generate chaotic activity -- however, a calculation of cross covariances using this approach has not been provided. Here, we calculate cross covariances self-consistently via a two-site cavity DMFT. We use this theory to probe spatiotemporal features of activity coordination in a classic random-network model with independent and identically distributed (i.i.d.) couplings, showing an extensive but fractionally low effective dimension of activity and a long population-level timescale. Our formulae apply to a wide range of single-unit dynamics and generalize to non-i.i.d. couplings. As an example of the latter, we analyze the case of partially symmetric couplings.
Submitted 11 September, 2023; v1 submitted 25 July, 2022;
originally announced July 2022.
-
Parallel locomotor control strategies in mice and flies
Authors:
Ana I. Gonçalves,
Jacob A. Zavatone-Veth,
Megan R. Carey,
Damon A. Clark
Abstract:
Our understanding of the neural basis of locomotor behavior can be informed by careful quantification of animal movement. Classical descriptions of legged locomotion have defined discrete locomotor gaits, characterized by distinct patterns of limb movement. Recent technical advances have enabled increasingly detailed characterization of limb kinematics across many species, imposing tighter constraints on neural control. Here, we highlight striking similarities between coordination patterns observed in two genetic model organisms: the laboratory mouse and Drosophila. Both species exhibit continuously-variable coordination patterns with similar low-dimensional structure, suggesting shared principles for limb coordination and descending neural control.
Submitted 22 December, 2021;
originally announced December 2021.
-
Credit Assignment Through Broadcasting a Global Error Vector
Authors:
David G. Clark,
L. F. Abbott,
SueYeon Chung
Abstract:
Backpropagation (BP) uses detailed, unit-specific feedback to train deep neural networks (DNNs) with remarkable success. That biological neural circuits appear to perform credit assignment, but cannot implement BP, implies the existence of other powerful learning algorithms. Here, we explore the extent to which a globally broadcast learning signal, coupled with local weight updates, enables training of DNNs. We present both a learning rule, called global error-vector broadcasting (GEVB), and a class of DNNs, called vectorized nonnegative networks (VNNs), in which this learning rule operates. VNNs have vector-valued units and nonnegative weights past the first layer. The GEVB learning rule generalizes three-factor Hebbian learning, updating each weight by an amount proportional to the inner product of the presynaptic activation and a globally broadcast error vector when the postsynaptic unit is active. We prove that these weight updates are matched in sign to the gradient, enabling accurate credit assignment. Moreover, at initialization, these updates are exactly proportional to the gradient in the limit of infinite network width. GEVB matches the performance of BP in VNNs, and in some cases outperforms direct feedback alignment (DFA) applied in conventional networks. Unlike DFA, GEVB successfully trains convolutional layers. Altogether, our theoretical and empirical results point to a surprisingly powerful role for a global learning signal in training DNNs.
Submitted 28 October, 2021; v1 submitted 8 June, 2021;
originally announced June 2021.
-
Simple Records Support Robust Indirect Reciprocity
Authors:
Daniel Clark,
Drew Fudenberg,
Alexander Wolitzky
Abstract:
Indirect reciprocity is a foundational mechanism of human cooperation. Existing models of indirect reciprocity fail to robustly support social cooperation: image scoring models fail to provide robust incentives, while social standing models are not informationally robust. Here we provide a new model of indirect reciprocity based on simple, decentralized records: each individual's record depends on their own past behavior alone, and not on their partners' past behavior or their partners' partners' past behavior. When social dilemmas exhibit a coordination motive (or strategic complementarity), tolerant trigger strategies based on simple records can robustly support positive social cooperation and exhibit strong stability properties. In the opposite case of strategic substitutability, positive social cooperation cannot be robustly supported. Thus, the strength of short-run coordination motives in social dilemmas determines the prospects for robust long-run cooperation.
Submitted 9 October, 2019;
originally announced October 2019.
-
Optimizing Differential Identifiability Improves Connectome Predictive Modeling of Cognitive Deficits in Alzheimer's Disease
Authors:
Diana O. Svaldi,
Joaquín Goñi,
Kausar Abbas,
Enrico Amico,
David G. Clark,
Charanya Muralidharan,
Mario Dzemidzic,
John D. West,
Shannon L. Risacher,
Andrew J. Saykin,
Liana G. Apostolova
Abstract:
Functional connectivity, as estimated using resting state fMRI, has shown potential in bridging the gap between pathophysiology and cognition. However, clinical use of functional connectivity biomarkers is impeded by unreliable estimates of individual functional connectomes and lack of generalizability of models predicting cognitive outcomes from connectivity. To address these issues, we combine the frameworks of connectome predictive modeling and differential identifiability. Using the combined framework, we show that enhancing the individual fingerprint of resting state functional connectomes leads to robust identification of functional networks associated with cognitive outcomes and also improves prediction of cognitive outcomes from functional connectomes. Using a comprehensive spectrum of cognitive outcomes associated with Alzheimer's disease, we identify and characterize functional networks associated with specific cognitive deficits exhibited in Alzheimer's disease. This combined framework is an important step in making individual-level predictions of cognition from resting state functional connectomes and in understanding the relationship between cognition and connectivity.
Submitted 12 December, 2019; v1 submitted 16 August, 2019;
originally announced August 2019.
-
Spiking Linear Dynamical Systems on Neuromorphic Hardware for Low-Power Brain-Machine Interfaces
Authors:
David G. Clark,
Jesse A. Livezey,
Edward F. Chang,
Kristofer E. Bouchard
Abstract:
Neuromorphic architectures achieve low-power operation by using many simple spiking neurons in lieu of traditional hardware. Here, we develop methods for precise linear computations in spiking neural networks and use these methods to map the evolution of a linear dynamical system (LDS) onto an existing neuromorphic chip: IBM's TrueNorth. We analytically characterize, and numerically validate, the discrepancy between the spiking LDS state sequence and that of its non-spiking counterpart. These analytical results shed light on the multiway tradeoff between time, space, energy, and accuracy in neuromorphic computation. To demonstrate the utility of our work, we implemented a neuromorphic Kalman filter (KF) and used it for offline decoding of human vocal pitch from neural data. The neuromorphic KF could be used for low-power filtering in domains beyond neuroscience, such as navigation or robotics.
Submitted 5 June, 2018; v1 submitted 22 May, 2018;
originally announced May 2018.
-
Prefer Nested Segmentation to Compound Segmentation
Authors:
Haley D. Clark,
Stefan A. Reinsberg,
Vitali Moiseenko,
Jonn Wu,
Steven D. Thomas
Abstract:
Introduction: Intra-organ radiation dose sensitivity is becoming increasingly relevant in clinical radiotherapy. One method for assessment involves partitioning delineated regions of interest and comparing the relative contributions or importance to clinical outcomes. We show that an intuitive method for dividing organ contours, compound (sub-)segmentation, can unintentionally lead to sub-segments with inconsistent volumes, which will bias relative importance assessment. An improved technique, nested segmentation, is introduced and compared. Methods: Clinical radiotherapy planning parotid contours from 510 patients were segmented. Counts of radiotherapy dose matrix voxels interior to sub-segments were used to determine the equivalency of sub-segment volumes. The distributions of voxel counts within sub-segments were compared using Kolmogorov-Smirnov tests and characterized by their dispersion. Analytical solutions for 2D/3D analogues were derived and sub-segment areas/volumes were compared directly. Results: Both parotid and 2D/3D region of interest analogue segmentation confirmed that compound segmentation intrinsically produces sub-segments with volumes that depend on the region of interest shape and selection location. Significant volume differences were observed when sub-segmenting parotid contours into 18ths, and vanishingly small sub-segments were observed when sub-segmenting into 96ths. Central sub-segments were considerably smaller than sub-segments on the periphery. Nested segmentation did not exhibit these shortcomings and produced sub-segments with equivalent volumes when dose grid and contour collinearity was addressed, even when dividing the parotid into 96ths. Nested segmentation was always faster or equivalent in runtime to compound segmentation. Conclusions: Nested segmentation is better suited than compound segmentation for analyses requiring equal weighting of sub-segments.
Submitted 3 May, 2017;
originally announced May 2017.
-
The Bacterial Chemotactic Response Reflects a Compromise Between Transient and Steady State Behavior
Authors:
Damon A. Clark,
Lars C. Grant
Abstract:
Swimming bacteria detect chemical gradients by performing temporal comparisons of recent measurements of chemical concentration. These comparisons are described quantitatively by the chemotactic response function, which we expect to optimize chemotactic behavioral performance. We identify two independent chemotactic performance criteria: in the short run, a favorable response function should move bacteria up chemoattractant gradients, while in the long run, bacteria should aggregate at peaks of chemoattractant concentration. Surprisingly, these two criteria conflict, so that when one performance criterion is most favorable, the other is unfavorable. Since both types of behavior are biologically relevant, we include both behaviors in a composite optimization that yields a response function that closely resembles experimental measurements. Our work suggests that the bacterial chemotactic response function can be derived from simple behavioral considerations, and sheds light on how the response function contributes to chemotactic performance.
Submitted 2 February, 2006;
originally announced February 2006.