-
A Lyapunov theory demonstrating a fundamental limit on the speed of systems consolidation
Authors:
Alireza Alemi,
Emre R. F. Aksay,
Mark S. Goldman
Abstract:
The nervous system reorganizes memories from an early site to a late site, a commonly observed feature of learning and memory systems known as systems consolidation. Previous work has suggested learning rules by which consolidation may occur. Here, we provide conditions under which such rules are guaranteed to lead to stable convergence of learning and consolidation. We use the theory of Lyapunov functions, which enforces stability by requiring learning rules to decrease an energy-like (Lyapunov) function. We present the theory in the context of a simple circuit architecture motivated by classic models of learning in systems consolidation mediated by the cerebellum. Stability is only guaranteed if the learning rate in the late stage is not faster than the learning rate in the early stage. Further, the slower the learning rate at the late stage, the larger the perturbation the system can tolerate with a guarantee of stability. We provide intuition for this result by mapping the consolidation model to a damped driven oscillator system, and showing that the ratio of early- to late-stage learning rates in the consolidation model can be directly identified with the (square of the) oscillator's damping ratio. This work suggests the power of the Lyapunov approach to provide constraints on nervous system function.
Submitted 7 February, 2025; v1 submitted 2 February, 2024;
originally announced February 2024.
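For orientation, the oscillator analogy mentioned in the abstract can be written against the textbook damped driven oscillator. This is only a sketch: the symbols $\eta_{\text{early}}$ and $\eta_{\text{late}}$ are placeholder names for the early- and late-stage learning rates, and the abstract does not give the exact form of the mapping, so the correspondence on the last line is schematic.

```latex
% Standard damped driven oscillator (mass m, damping c, stiffness k, drive F):
m\,\ddot{x} + c\,\dot{x} + k\,x = F(t)
\quad\Longleftrightarrow\quad
\ddot{x} + 2\zeta\omega_0\,\dot{x} + \omega_0^{2}\,x = \frac{F(t)}{m},
\qquad
\omega_0 = \sqrt{\frac{k}{m}}, \quad
\zeta = \frac{c}{2\sqrt{mk}}.

% Identification stated in the abstract (schematic; placeholder symbols):
\zeta^{2} \;\longleftrightarrow\; \frac{\eta_{\text{early}}}{\eta_{\text{late}}}
```

Read this way, a slower late stage relative to the early stage corresponds to a larger damping ratio, i.e. a more heavily damped response.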
-
The brain as an efficient and robust adaptive learner
Authors:
Sophie Denève,
Alireza Alemi,
Ralph Bourdoukan
Abstract:
Understanding how the brain learns to compute functions reliably, efficiently and robustly with noisy spiking activity is a fundamental challenge in neuroscience. Most sensory and motor tasks can be described as dynamical systems and could presumably be learned by adjusting connection weights in a recurrent biological neural network. However, this is greatly complicated by the credit assignment problem for learning in recurrent networks, i.e. the contribution of each connection to the global output error cannot be determined based only on quantities locally accessible to the synapse. Combining tools from adaptive control theory and efficient coding theories, we propose that neural circuits can indeed learn complex dynamic tasks with local synaptic plasticity rules as long as they combine two experimentally established neural mechanisms. First, they should receive top-down feedback driving both their activity and their synaptic plasticity. Second, inhibitory interneurons should maintain a tight balance between excitation and inhibition in the circuit. The resulting networks could learn arbitrary dynamical systems and produce irregular spike trains as variable as those observed experimentally. Yet, this variability in single neurons may hide an extremely efficient and robust computation at the population level.
Submitted 22 May, 2017;
originally announced May 2017.
-
Learning arbitrary dynamics in efficient, balanced spiking networks using local plasticity rules
Authors:
Alireza Alemi,
Christian Machens,
Sophie Denève,
Jean-Jacques Slotine
Abstract:
Understanding how recurrent neural circuits can learn to implement dynamical systems is a fundamental challenge in neuroscience. The credit assignment problem, i.e. determining the local contribution of each synapse to the network's global output error, is a major obstacle in deriving biologically plausible local learning rules. Moreover, spiking recurrent networks implementing such tasks should not be hugely costly in terms of number of neurons and spikes, as they often are when adapted from rate models. Finally, these networks should be robust to noise and neuronal death in order to sustain these representations in the face of such naturally occurring perturbations. We approach this problem by fusing the theory of efficient, balanced spiking networks (EBN) with nonlinear adaptive control theory. Local learning rules are ensured by feeding back into the network its own error, resulting in a synaptic plasticity rule depending solely on presynaptic inputs and post-synaptic feedback. The spiking efficiency and robustness of the network are guaranteed by maintaining a tight excitatory/inhibitory balance, ensuring that each spike represents a local projection of the global output error and minimizes a loss function. The resulting networks can learn to implement complex dynamics with very small numbers of neurons and spikes, exhibit the same spike train variability as observed experimentally, and are extremely robust to noise and neuronal loss.
Submitted 4 August, 2017; v1 submitted 22 May, 2017;
originally announced May 2017.
-
Exponential Capacity in an Autoencoder Neural Network with a Hidden Layer
Authors:
Alireza Alemi,
Alia Abbara
Abstract:
A fundamental aspect of limitations in learning any computation in neural architectures is characterizing their optimal capacities.
An important, widely used neural architecture is the autoencoder, in which the network reconstructs the input at the output layer via a representation at a hidden layer.
Even though capacities of several neural architectures have been addressed using statistical physics methods, the capacity of autoencoder neural networks is not well-explored.
Here, we analytically show that an autoencoder network of binary neurons with a hidden layer can achieve a capacity that grows exponentially with network size.
The network has fixed random weights encoding a set of dense input patterns into a dense, expanded (or \emph{overcomplete}) hidden layer representation. A set of learnable weights decodes the input patterns at the output layer. We perform a mean-field approximation of the model to reduce it to a perceptron problem with an input-output dependency. Carrying out Gardner's \emph{replica} calculation, we show that as the expansion ratio, defined as the number of hidden units over the number of input units, increases, the autoencoding capacity grows exponentially even when the sparseness or the coding level of the hidden layer representation is changed. The replica-symmetric solution is locally stable and is in good agreement with simulation results obtained using a local learning rule. In addition, the degree of symmetry between the encoding and decoding weights monotonically increases with the expansion ratio.
Submitted 21 May, 2017;
originally announced May 2017.
-
An attractor neural network architecture with an ultra high information capacity: numerical results
Authors:
Alireza Alemi
Abstract:
The attractor neural network is an important theoretical scenario for modeling memory function in the hippocampus and in the cortex. In these models, memories are stored in the plastic recurrent connections of neural populations in the form of "attractor states". The maximal information capacity for conventional abstract attractor networks with unconstrained connections is 2 bits/synapse. However, an unconstrained synapse has the capacity to store an infinite number of bits in a noiseless theoretical scenario: a capacity that conventional attractor networks cannot achieve. Here, I propose a hierarchical attractor network that can achieve an ultra high information capacity. The network has two layers: a visible layer with $N_v$ neurons, and a hidden layer with $N_h$ neurons. The visible-to-hidden connections are set at random and kept fixed during the training phase, in which the memory patterns are stored as fixed points of the network dynamics. The hidden-to-visible connections, initially normally distributed, are learned via a local, online learning rule called the Three-Threshold Learning Rule, and there are no within-layer connections. The simulation results suggest that the maximal information capacity grows exponentially with the expansion ratio $N_h/N_v$. As a first-order approximation to understand the mechanism providing the high capacity, I simulated a naive mean-field approximation (nMFA) of the network. The exponential increase was captured by the nMFA, revealing that a key underlying factor is the correlation between the hidden and the visible units. Additionally, it was observed that, at maximal capacity, the degree of symmetry of the connectivity between the hidden and the visible neurons increases with the expansion ratio. These results highlight the role of hierarchical architecture in remarkably increasing the performance of information storage in attractor networks.
Submitted 10 January, 2016; v1 submitted 3 December, 2015;
originally announced December 2015.
-
A three-threshold learning rule approaches the maximal capacity of recurrent neural networks
Authors:
Alireza Alemi,
Carlo Baldassi,
Nicolas Brunel,
Riccardo Zecchina
Abstract:
Understanding the theoretical foundations of how memories are encoded and retrieved in neural populations is a central challenge in neuroscience. A popular theoretical scenario for modeling memory function is the attractor neural network scenario, whose prototype is the Hopfield model. The model has a poor storage capacity, compared with the capacity achieved with perceptron learning algorithms. Here, by transforming the perceptron learning rule, we present an online learning rule for a recurrent neural network that achieves near-maximal storage capacity without an explicit supervisory error signal, relying only upon locally accessible information. The fully-connected network consists of excitatory binary neurons with plastic recurrent connections and non-plastic inhibitory feedback stabilizing the network dynamics; the memory patterns are presented online as strong afferent currents, producing a bimodal distribution for the neuron synaptic inputs. Synapses corresponding to active inputs are modified as a function of the value of the local fields with respect to three thresholds. Above the highest threshold, and below the lowest threshold, no plasticity occurs. In between these two thresholds, potentiation/depression occurs when the local field is above/below an intermediate threshold. We simulated and analyzed a network of binary neurons implementing this rule and measured its storage capacity for different sizes of the basins of attraction. The storage capacity obtained through numerical simulations is shown to be close to the value predicted by analytical calculations. We also measured the dependence of capacity on the strength of external inputs. Finally, we quantified the statistics of the resulting synaptic connectivity matrix, and found that both the fraction of zero weight synapses and the degree of symmetry of the weight matrix increase with the number of stored patterns.
Submitted 3 August, 2015;
originally announced August 2015.
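The three-threshold rule described in the abstract above can be sketched for a single neuron as follows. This is a minimal illustration, not the authors' implementation: the function name, the argument names, the learning-rate value, and the clipping of weights at zero (to keep synapses excitatory) are all assumptions, and the local field is passed in rather than computed, since the abstract does not spell out how afferent and inhibitory currents enter it.

```python
import numpy as np


def three_threshold_update(w, x, h, theta_low, theta_mid, theta_high, lr=0.01):
    """Return updated incoming weights for one neuron (hypothetical helper).

    w  : current weights onto the neuron (assumed non-negative)
    x  : binary presynaptic activity (1 = active input, 0 = silent)
    h  : local field of the neuron for the current pattern
    """
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)

    # Outside the outer thresholds no plasticity occurs.
    if h <= theta_low or h >= theta_high:
        return w.copy()

    # Between the outer thresholds: potentiate above the intermediate
    # threshold, depress below it.
    if h > theta_mid:
        sign = 1.0
    elif h < theta_mid:
        sign = -1.0
    else:
        sign = 0.0

    # Only synapses with active inputs are modified.
    w_new = w + lr * sign * x

    # Assumption: weights are clipped at zero so they stay excitatory.
    return np.clip(w_new, 0.0, None)


if __name__ == "__main__":
    w0 = np.array([0.5, 0.5, 0.5])
    pattern = np.array([1, 0, 1])
    print(three_threshold_update(w0, pattern, h=2.0,
                                 theta_low=0.0, theta_mid=1.0, theta_high=3.0))
```

The update uses only the presynaptic activity and the neuron's own local field, which is what makes the rule local; no explicit error signal is passed in.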
-
You Can Run, You Can Hide: The Epidemiology and Statistical Mechanics of Zombies
Authors:
Alexander A. Alemi,
Matthew Bierbaum,
Christopher R. Myers,
James P. Sethna
Abstract:
We use a popular fictional disease, zombies, in order to introduce techniques used in modern epidemiology modelling, and ideas and techniques used in the numerical study of critical phenomena. We consider variants of zombie models, from fully connected continuous time dynamics to a full scale exact stochastic dynamic simulation of a zombie outbreak on the continental United States. Along the way, we offer a closed form analytical expression for the fully connected differential equation, and demonstrate that the single person per site two dimensional square lattice version of zombies lies in the percolation universality class. We end with a quantitative study of the full scale US outbreak, including the average susceptibility of different geographical regions.
Submitted 4 June, 2015; v1 submitted 3 March, 2015;
originally announced March 2015.
-
Mechanical Properties of Growing Melanocytic Nevi and the Progression to Melanoma
Authors:
Alessandro Taloni,
Alexander A. Alemi,
Emilio Ciusani,
James P. Sethna,
Stefano Zapperi,
Caterina A. M. La Porta
Abstract:
Melanocytic nevi are benign proliferations that sometimes turn into malignant melanoma in a way that is still unclear from the biochemical and genetic point of view. Diagnostic and prognostic tools are then mostly based on dermoscopic examination and morphological analysis of histological tissues. To investigate the role of mechanics and geometry in the morphological dynamics of melanocytic nevi, we study a computational model for cell proliferation in a layered non-linear elastic tissue. Numerical simulations suggest that the morphology of the nevus is correlated to the initial location of the proliferating cell starting the growth process and to the mechanical properties of the tissue. Our results also indicate that melanocytes are subject to compressive stresses that fluctuate widely in the nevus and depend on the growth stage. Numerical simulations of cells in the epidermis releasing matrix metalloproteinases display an accelerated invasion of the dermis by destroying the basal membrane. Moreover, we suggest experimentally that osmotic stress and collagen inhibit growth in primary melanoma cells while the effect is much weaker in metastatic cells. Knowing that morphological features of nevi might also reflect geometry and mechanics rather than malignancy could be relevant for diagnostic purposes.
Submitted 15 April, 2014;
originally announced April 2014.
-
Growth and form of melanoma cell colonies
Authors:
Massimiliano Maria Baraldi,
Alexander A Alemi,
James P Sethna,
Sergio Caracciolo,
Caterina A M La Porta,
Stefano Zapperi
Abstract:
We study the statistical properties of melanoma cell colonies grown in vitro by analyzing the results of crystal violet assays at different concentrations of initial plated cells and for different growth times. The distribution of colony sizes is described well by a continuous time branching process. To characterize the shape fluctuations of the colonies, we compute the distribution of eccentricities. The experimental results are compared with numerical results for models of random division of elastic cells, showing that experimental results are best reproduced by restricting cell division to the outer rim of the colony. Our results serve to illustrate the wealth of information that can be extracted by a standard experimental method such as the crystal violet assay.
Submitted 27 August, 2013;
originally announced August 2013.
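To make the phrase "continuous time branching process" in the abstract above concrete, here is a minimal sketch using the simplest such process, a pure-birth (Yule) process in which every cell divides independently at a constant rate. This is an assumption chosen for illustration: the abstract does not say which branching process the authors fit, and the function name, rate, and growth time below are hypothetical.

```python
import math
import random


def simulate_colony_size(rate, t_end, rng):
    """Simulate one colony grown from a single cell (hypothetical helper).

    Each cell divides independently at `rate`, so with n cells the waiting
    time to the next division is exponential with rate `rate * n`.
    """
    n, t = 1, 0.0
    while True:
        t += rng.expovariate(rate * n)  # time until the next division
        if t > t_end:
            return n  # growth time elapsed before the next division
        n += 1


rng = random.Random(0)  # fixed seed for reproducibility
division_rate = 1.0     # illustrative value, arbitrary units
growth_time = 2.0       # illustrative value, same time units

sizes = [simulate_colony_size(division_rate, growth_time, rng)
         for _ in range(5000)]
mean_size = sum(sizes) / len(sizes)

# For a Yule process started from one cell, the mean size is exp(rate * t).
expected_mean = math.exp(division_rate * growth_time)

if __name__ == "__main__":
    print(f"simulated mean size: {mean_size:.2f}")
    print(f"Yule-process mean:   {expected_mean:.2f}")
```

The histogram of `sizes` is the kind of colony-size distribution that would be compared with assay data; a process with cell death or time-varying rates would need a different simulation.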