-
Neural timescales from a computational perspective
Authors:
Roxana Zeraati,
Anna Levina,
Jakob H. Macke,
Richard Gao
Abstract:
Neural activity fluctuates over a wide range of timescales within and across brain areas. Experimental observations suggest that diverse neural timescales reflect information in dynamic environments. However, how timescales are defined and measured from brain recordings varies across the literature. Moreover, these observations do not specify the mechanisms underlying timescale variations, nor whether specific timescales are necessary for neural computation and brain function. Here, we synthesize three directions where computational approaches can distill the broad set of empirical observations into quantitative and testable theories: We review (i) how different data analysis methods quantify timescales across distinct behavioral states and recording modalities, (ii) how biophysical models provide mechanistic explanations for the emergence of diverse timescales, and (iii) how task-performing networks and machine learning models uncover the functional relevance of neural timescales. This integrative computational perspective thus complements experimental investigations, providing a holistic view on how neural timescales reflect the relationship between brain structure, dynamics, and behavior.
Submitted 12 May, 2025; v1 submitted 4 September, 2024;
originally announced September 2024.
-
Modular Growth of Hierarchical Networks: Efficient, General, and Robust Curriculum Learning
Authors:
Mani Hamidi,
Sina Khajehabdollahi,
Emmanouil Giannakakis,
Tim Schäfer,
Anna Levina,
Charley M. Wu
Abstract:
Structural modularity is a pervasive feature of biological neural networks, which have been linked to several functional and computational advantages. Yet, the use of modular architectures in artificial neural networks has been relatively limited despite early successes. Here, we explore the performance and functional dynamics of a modular network trained on a memory task via an iterative growth curriculum. We find that for a given classical, non-modular recurrent neural network (RNN), an equivalent modular network will perform better across multiple metrics, including training time, generalizability, and robustness to some perturbations. We further examine how different aspects of a modular network's connectivity contribute to its computational capability. We then demonstrate that the inductive bias introduced by the modular topology is strong enough for the network to perform well even when the connectivity within modules is fixed and only the connections between modules are trained. Our findings suggest that gradual modular growth of RNNs could provide advantages for learning increasingly complex tasks on evolutionary timescales, and help build more scalable and compressible artificial networks.
Submitted 10 June, 2024;
originally announced June 2024.
-
Learning with 3D rotations, a hitchhiker's guide to SO(3)
Authors:
A. René Geist,
Jonas Frey,
Mikel Zhobro,
Anna Levina,
Georg Martius
Abstract:
Many settings in machine learning require the selection of a rotation representation. However, choosing a suitable representation from the many available options is challenging. This paper acts as a survey and guide through rotation representations. We walk through their properties that harm or benefit deep learning with gradient-based optimization. By consolidating insights from rotation-based learning, we provide a comprehensive overview of learning functions with rotation representations. We provide guidance on selecting representations based on whether rotations are in the model's input or output and whether the data primarily comprises small angles.
Submitted 19 June, 2024; v1 submitted 17 April, 2024;
originally announced April 2024.
-
Network bottlenecks and task structure control the evolution of interpretable learning rules in a foraging agent
Authors:
Emmanouil Giannakakis,
Sina Khajehabdollahi,
Anna Levina
Abstract:
Developing reliable mechanisms for continuous local learning is a central challenge faced by biological and artificial systems. Yet, how the environmental factors and structural constraints on the learning network influence the optimal plasticity mechanisms remains obscure even for simple settings. To elucidate these dependencies, we study meta-learning via evolutionary optimization of simple reward-modulated plasticity rules in embodied agents solving a foraging task. We show that unconstrained meta-learning leads to the emergence of diverse plasticity rules. However, regularization and bottlenecks to the model help reduce this variability, resulting in interpretable rules. Our findings indicate that the meta-learning of plasticity rules is very sensitive to various parameters, with this sensitivity possibly reflected in the learning rules found in biological networks. When included in models, these dependencies can be used to discover potential objective functions and details of biological learning via comparisons with experimental observations.
Submitted 20 March, 2024;
originally announced March 2024.
-
Revising clustering and small-worldness in brain networks
Authors:
Tanguy Fardet,
Emmanouil Giannakakis,
Lukas Paulun,
Anna Levina
Abstract:
As more connectome data become available, the question of how to best analyse the structure of biological neural networks becomes increasingly pertinent. In brain networks, knowing that two areas are connected is often not sufficient, as the directionality and weight of the connection affect the dynamics in crucial ways. Still, the methods commonly used to estimate network properties, such as clustering and small-worldness, usually disregard features encoded in the directionality and strength of network connections. To address this issue, we propose using fully-weighted and directed clustering measures that provide higher sensitivity to non-random structural features. Using artificial networks, we demonstrate the problems with methods routinely used in the field and how fully-weighted and directed methods can alleviate them. Specifically, we highlight their robustness to noise and their ability to address thresholding issues, particularly in inferred networks. We further apply our method to the connectomes of different species and uncover regularities and correlations between neuronal structures and functions that cannot be detected with traditional clustering metrics. Finally, we extend the notion of small-worldness in brain networks to account for weights and directionality and show that some connectomes can no longer be considered "small-world". Overall, our study makes a case for a combined use of fully-weighted and directed measures to deal with the variability of brain networks and suggests the presence of complex patterns in neural connectivity that can only be revealed using such methods.
Submitted 28 January, 2024;
originally announced January 2024.
-
Emergent mechanisms for long timescales depend on training curriculum and affect performance in memory tasks
Authors:
Sina Khajehabdollahi,
Roxana Zeraati,
Emmanouil Giannakakis,
Tim Jakob Schäfer,
Georg Martius,
Anna Levina
Abstract:
Recurrent neural networks (RNNs) in the brain and in silico excel at solving tasks with intricate temporal dependencies. Long timescales required for solving such tasks can arise from properties of individual neurons (single-neuron timescale, $τ$, e.g., membrane time constant in biological neurons) or recurrent interactions among them (network-mediated timescale). However, the contribution of each mechanism for optimally solving memory-dependent tasks remains poorly understood. Here, we train RNNs to solve $N$-parity and $N$-delayed match-to-sample tasks with increasing memory requirements controlled by $N$ by simultaneously optimizing recurrent weights and $τ$s. We find that for both tasks RNNs develop longer timescales with increasing $N$, but depending on the learning objective, they use different mechanisms. Two distinct curricula define learning objectives: sequential learning of a single-$N$ (single-head) or simultaneous learning of multiple $N$s (multi-head). Single-head networks increase their $τ$ with $N$ and are able to solve tasks for large $N$, but they suffer from catastrophic forgetting. However, multi-head networks, which are explicitly required to hold multiple concurrent memories, keep $τ$ constant and develop longer timescales through recurrent connectivity. Moreover, we show that the multi-head curriculum increases training speed and network stability to ablations and perturbations, and allows RNNs to generalize better to tasks beyond their training regime. This curriculum also significantly improves training GRUs and LSTMs for large-$N$ tasks. Our results suggest that adapting timescales to task requirements via recurrent interactions allows learning more complex objectives and improves the RNN's performance.
Submitted 30 October, 2024; v1 submitted 22 September, 2023;
originally announced September 2023.
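For readers unfamiliar with the $N$-parity task named in the abstract above, the following is a minimal sketch of how its targets could be generated. It assumes the common definition in which the target at each step is the parity (sum modulo 2) of the most recent $N$ binary inputs; the function name `n_parity_targets` and the handling of the first $N-1$ steps are illustrative assumptions, not taken from the paper.

```python
from collections import deque


def n_parity_targets(bits, n):
    """Return the parity of the last n bits at each step where n bits exist.

    Assumption: no target is emitted for the first n - 1 steps, because a
    full window of n inputs is not yet available.
    """
    window = deque(maxlen=n)  # sliding window holding the most recent n inputs
    targets = []
    for b in bits:
        window.append(b)
        if len(window) == n:
            targets.append(sum(window) % 2)  # 1 if an odd number of ones
    return targets


if __name__ == "__main__":
    # Windows: [1,0,0] -> 1, [0,0,1] -> 1, [0,1,1] -> 0
    print(n_parity_targets([1, 0, 0, 1, 1], 3))
```

Increasing `n` lengthens the window the network has to remember, which is how the memory requirement scales with $N$ in this sketch.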
-
The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks
Authors:
Aaron Spieler,
Nasim Rahaman,
Georg Martius,
Bernhard Schölkopf,
Anna Levina
Abstract:
Biological cortical neurons are remarkably sophisticated computational devices, temporally integrating their vast synaptic input over an intricate dendritic tree, subject to complex, nonlinearly interacting internal biological processes. A recent study proposed to characterize this complexity by fitting accurate surrogate models to replicate the input-output relationship of a detailed biophysical cortical pyramidal neuron model and discovered it needed temporal convolutional networks (TCN) with millions of parameters. Requiring this many parameters, however, could stem from a misalignment between the inductive biases of the TCN and the cortical neuron's computations. In light of this, and to explore the computational implications of leaky memory units and nonlinear dendritic processing, we introduce the Expressive Leaky Memory (ELM) neuron model, a biologically inspired phenomenological model of a cortical neuron. Remarkably, by exploiting such slowly decaying memory-like hidden states and two-layered nonlinear integration of synaptic input, our ELM neuron can accurately match the aforementioned input-output relationship with under ten thousand trainable parameters. To further assess the computational ramifications of our neuron design, we evaluate it on various tasks with demanding temporal structures, including the Long Range Arena (LRA) datasets, as well as a novel neuromorphic dataset based on the Spiking Heidelberg Digits dataset (SHD-Adding). Leveraging a larger number of memory units with sufficiently long timescales, and correspondingly sophisticated synaptic integration, the ELM neuron displays substantial long-range processing capabilities, reliably outperforming the classic Transformer or Chrono-LSTM architectures on LRA, and even solving the Pathfinder-X task with over 70% accuracy (16k context length).
Submitted 17 March, 2024; v1 submitted 14 June, 2023;
originally announced June 2023.
-
Locally adaptive cellular automata for goal-oriented self-organization
Authors:
Sina Khajehabdollahi,
Emmanouil Giannakakis,
Victor Buendia,
Georg Martius,
Anna Levina
Abstract:
The essential ingredient for studying the phenomena of emergence is the ability to generate and manipulate emergent systems that span large scales. Cellular automata are the model class particularly known for their effective scalability but are also typically constrained by fixed local rules. In this paper, we propose a new model class of adaptive cellular automata that allows for the generation of scalable and expressive models. We show how to implement computation-effective adaptation by coupling the update rule of the cellular automaton with itself and the system state in a localized way. To demonstrate the applications of this approach, we implement two different emergent models: a self-organizing Ising model and two types of plastic neural networks, a rate and spiking model. With the Ising model, we show how coupling local/global temperatures to local/global measurements can tune the model to stay in the vicinity of the critical temperature. With the neural models, we reproduce a classical balanced state in large recurrent neuronal networks with excitatory and inhibitory neurons and various plasticity mechanisms. Our study opens multiple directions for studying collective behavior and emergence.
Submitted 12 June, 2023;
originally announced June 2023.
-
When to be critical? Performance and evolvability in different regimes of neural Ising agents
Authors:
Sina Khajehabdollahi,
Jan Prosi,
Emmanouil Giannakakis,
Georg Martius,
Anna Levina
Abstract:
It has long been hypothesized that operating close to the critical state is beneficial for natural, artificial and their evolutionary systems. We put this hypothesis to the test in a system of evolving foraging agents controlled by neural networks that can adapt agents' dynamical regime throughout evolution. Surprisingly, we find that all populations that discover solutions evolve to be subcritical. By a resilience analysis, we find that there are still benefits of starting the evolution in the critical regime. Namely, initially critical agents maintain their fitness level under environmental changes (for example, in the lifespan) and degrade gracefully when their genome is perturbed. At the same time, initially subcritical agents, even when evolved to the same fitness, are often inadequate to withstand the changes in the lifespan and degrade catastrophically with genetic perturbations. Furthermore, we find the optimal distance to criticality depends on the task complexity. To test this, we introduce a hard and a simple task: for the hard task, agents evolve closer to criticality whereas more subcritical solutions are found for the simple task. We verify that our results are independent of the selected evolutionary mechanisms by testing them on two fundamentally different approaches: a genetic algorithm and an evolutionary strategy. In summary, our study suggests that although optimal behaviour in the simple task is obtained in a subcritical regime, initializing near criticality is important to be efficient at finding optimal solutions for new tasks of unknown complexity.
Submitted 24 November, 2023; v1 submitted 28 March, 2023;
originally announced March 2023.
-
Assessing aesthetics of generated abstract images using correlation structure
Authors:
Sina Khajehabdollahi,
Georg Martius,
Anna Levina
Abstract:
Can we generate abstract aesthetic images without bias from natural or human-selected image corpora? Are aesthetic images singled out in their correlation functions? In this paper we give answers to these and more questions. We generate images using compositional pattern-producing networks with random weights and varying architecture. We demonstrate that even with randomly selected weights the correlation functions remain largely determined by the network architecture. In a controlled experiment, human subjects picked aesthetic images out of a large dataset of all generated images. Statistical analysis reveals that the correlation function is indeed different for aesthetic images.
Submitted 18 May, 2021;
originally announced May 2021.
-
Weighted directed clustering: interpretations and requirements for heterogeneous, inferred, and measured networks
Authors:
Tanguy Fardet,
Anna Levina
Abstract:
Weights and directionality of the edges carry a large part of the information we can extract from a complex network. However, many network measures were formulated initially for undirected binary networks. The necessity to incorporate information about the weights led to the conception of multiple extensions, particularly for definitions of the local clustering coefficient discussed here. We uncover that not all of these extensions are fully-weighted; some depend on the degree and thus change a lot when an edge of infinitesimally small weight is exchanged for the absence of an edge, a feature that is not always desirable. We call these methods "hybrid" and argue that, in many situations, one should prefer fully-weighted definitions. After listing the necessary requirements for a method to properly analyze a wide variety of weighted networks, we propose a fully-weighted continuous clustering coefficient that satisfies all the previously proposed criteria while also being continuous with respect to vanishing weights. We demonstrate that the behavior and meaning of the Zhang–Horvath clustering and our new continuous definition provide complementary results and significantly outperform other definitions in multiple relevant conditions. Using synthetic and real-world examples, we show that when the network is inferred, noisy, or very heterogeneous, it is essential to use the fully-weighted clustering definitions.
Submitted 29 August, 2021; v1 submitted 13 May, 2021;
originally announced May 2021.
-
The dynamical regime and its importance for evolvability, task performance and generalization
Authors:
Jan Prosi,
Sina Khajehabdollahi,
Emmanouil Giannakakis,
Georg Martius,
Anna Levina
Abstract:
It has long been hypothesized that operating close to the critical state is beneficial for natural and artificial systems. We test this hypothesis by evolving foraging agents controlled by neural networks that can change the system's dynamical regime throughout evolution. Surprisingly, we find that all populations, regardless of their initial regime, evolve to be subcritical in simple tasks and even strongly subcritical populations can reach comparable performance. We hypothesize that the moderately subcritical regime combines the benefits of generalizability and adaptability brought by closeness to criticality with the stability of the dynamics characteristic of subcritical systems. By a resilience analysis, we find that initially critical agents maintain their fitness level even under environmental changes and degrade slowly with increasing perturbation strength. On the other hand, subcritical agents, originally evolved to the same fitness, were often rendered utterly inadequate and degraded faster. We conclude that although the subcritical regime is preferable for a simple task, the optimal deviation from criticality depends on the task difficulty: for harder tasks, agents evolve closer to criticality. Furthermore, subcritical populations cannot find the path to decrease their distance to criticality. In summary, our study suggests that initializing models near criticality is important to find an optimal and flexible solution.
Submitted 22 March, 2021;
originally announced March 2021.