Search | arXiv e-print repository

Backpropagation through space, time, and the brain

Authors: Benjamin Ellenberger, Paul Haider, Jakob Jordan, Kevin Max, Ismael Jaras, Laura Kriener, Federico Benitez, Mihai A. Petrovici

Abstract: How physical networks of neurons, bound by spatio-temporal locality constraints, can perform efficient credit assignment, remains, to a large extent, an open question. In machine learning, the answer is almost universally given by the error backpropagation algorithm, through both space and time. However, this algorithm is well-known to rely on biologically implausible assumptions, in particular wi… ▽ More How physical networks of neurons, bound by spatio-temporal locality constraints, can perform efficient credit assignment, remains, to a large extent, an open question. In machine learning, the answer is almost universally given by the error backpropagation algorithm, through both space and time. However, this algorithm is well-known to rely on biologically implausible assumptions, in particular with respect to spatio-temporal (non-)locality. Alternative forward-propagation models such as real-time recurrent learning only partially solve the locality problem, but only at the cost of scaling, due to prohibitive storage requirements. We introduce Generalized Latent Equilibrium (GLE), a computational framework for fully local spatio-temporal credit assignment in physical, dynamical networks of neurons. We start by defining an energy based on neuron-local mismatches, from which we derive both neuronal dynamics via stationarity and parameter dynamics via gradient descent. The resulting dynamics can be interpreted as a real-time, biologically plausible approximation of backpropagation through space and time in deep cortical networks with continuous-time neuronal dynamics and continuously active, local synaptic plasticity. In particular, GLE exploits the morphology of dendritic trees to enable more complex information storage and processing in single neurons, as well as the ability of biological neurons to phase-shift their output rate with respect to their membrane potential, which is essential in both directions of information propagation. For the forward computation, it enables the mapping of time-continuous inputs to neuronal space, effectively performing a spatio-temporal convolution. For the backward computation, it permits the temporal inversion of feedback signals, which consequently approximate the adjoint variables necessary for useful parameter updates. △ Less

Submitted 5 May, 2025; v1 submitted 25 March, 2024; originally announced March 2024.

Comments: First authorship shared by Benjamin Ellenberger and Paul Haider

arXiv:2402.16763 [pdf, other]

ELiSe: Efficient Learning of Sequences in Structured Recurrent Networks

Authors: Laura Kriener, Kristin Völk, Ben von Hünerbein, Federico Benitez, Walter Senn, Mihai A. Petrovici

Abstract: Behavior can be described as a temporal sequence of actions driven by neural activity. To learn complex sequential patterns in neural networks, memories of past activities need to persist on significantly longer timescales than the relaxation times of single-neuron activity. While recurrent networks can produce such long transients, training these networks is a challenge. Learning via error propag… ▽ More Behavior can be described as a temporal sequence of actions driven by neural activity. To learn complex sequential patterns in neural networks, memories of past activities need to persist on significantly longer timescales than the relaxation times of single-neuron activity. While recurrent networks can produce such long transients, training these networks is a challenge. Learning via error propagation confers models such as FORCE, RTRL or BPTT a significant functional advantage, but at the expense of biological plausibility. While reservoir computing circumvents this issue by learning only the readout weights, it does not scale well with problem complexity. We propose that two prominent structural features of cortical networks can alleviate these issues: the presence of a certain network scaffold at the onset of learning and the existence of dendritic compartments for enhancing neuronal information storage and computation. Our resulting model for Efficient Learning of Sequences (ELiSe) builds on these features to acquire and replay complex non-Markovian spatio-temporal patterns using only local, always-on and phase-free synaptic plasticity. We showcase the capabilities of ELiSe in a mock-up of birdsong learning, and demonstrate its flexibility with respect to parametrization, as well as its robustness to external disturbances. △ Less

Submitted 27 September, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

Comments: 15 pages, 7 figures, 1 table

arXiv:2309.10823 [pdf, other]

Gradient-based methods for spiking physical systems

Authors: Julian Göltz, Sebastian Billaudelle, Laura Kriener, Luca Blessing, Christian Pehle, Eric Müller, Johannes Schemmel, Mihai A. Petrovici

Abstract: Recent efforts have fostered significant progress towards deep learning in spiking networks, both theoretical and in silico. Here, we discuss several different approaches, including a tentative comparison of the results on BrainScaleS-2, and hint towards future such comparative studies. Recent efforts have fostered significant progress towards deep learning in spiking networks, both theoretical and in silico. Here, we discuss several different approaches, including a tentative comparison of the results on BrainScaleS-2, and hint towards future such comparative studies. △ Less

Submitted 29 August, 2023; originally announced September 2023.

Comments: 2 page abstract, submitted to and accepted by the NNPC (International conference on neuromorphic, natural and physical computing)

arXiv:2212.10249 [pdf, other]

Learning efficient backprojections across cortical hierarchies in real time

Authors: Kevin Max, Laura Kriener, Garibaldi Pineda García, Thomas Nowotny, Ismael Jaras, Walter Senn, Mihai A. Petrovici

Abstract: Models of sensory processing and learning in the cortex need to efficiently assign credit to synapses in all areas. In deep learning, a known solution is error backpropagation, which however requires biologically implausible weight transport from feed-forward to feedback paths. We introduce Phaseless Alignment Learning (PAL), a bio-plausible method to learn efficient feedback weights in layered… ▽ More Models of sensory processing and learning in the cortex need to efficiently assign credit to synapses in all areas. In deep learning, a known solution is error backpropagation, which however requires biologically implausible weight transport from feed-forward to feedback paths. We introduce Phaseless Alignment Learning (PAL), a bio-plausible method to learn efficient feedback weights in layered cortical hierarchies. This is achieved by exploiting the noise naturally found in biophysical systems as an additional carrier of information. In our dynamical system, all weights are learned simultaneously with always-on plasticity and using only information locally available to the synapses. Our method is completely phase-free (no forward and backward passes or phased learning) and allows for efficient error propagation across multi-layer cortical hierarchies, while maintaining biologically plausible signal transport and learning. Our method is applicable to a wide class of models and improves on previously known biologically plausible ways of credit assignment: compared to random synaptic feedback, it can solve complex tasks with less neurons and learn more useful latent representations. We demonstrate this on various classification tasks using a cortical microcircuit model with prospective coding. △ Less

Submitted 2 February, 2024; v1 submitted 20 December, 2022; originally announced December 2022.

Comments: Updated with streamlined main part, CIFAR-10 simulations, including DFA and minor fixes

arXiv:2110.14549 [pdf, other]

Latent Equilibrium: A unified learning theory for arbitrarily fast computation with arbitrarily slow neurons

Authors: Paul Haider, Benjamin Ellenberger, Laura Kriener, Jakob Jordan, Walter Senn, Mihai A. Petrovici

Abstract: The response time of physical computational elements is finite, and neurons are no exception. In hierarchical models of cortical networks each layer thus introduces a response lag. This inherent property of physical dynamical systems results in delayed processing of stimuli and causes a timing mismatch between network output and instructive signals, thus afflicting not only inference, but also lea… ▽ More The response time of physical computational elements is finite, and neurons are no exception. In hierarchical models of cortical networks each layer thus introduces a response lag. This inherent property of physical dynamical systems results in delayed processing of stimuli and causes a timing mismatch between network output and instructive signals, thus afflicting not only inference, but also learning. We introduce Latent Equilibrium, a new framework for inference and learning in networks of slow components which avoids these issues by harnessing the ability of biological neurons to phase-advance their output with respect to their membrane potential. This principle enables quasi-instantaneous inference independent of network depth and avoids the need for phased plasticity or computationally expensive network relaxation phases. We jointly derive disentangled neuron and synapse dynamics from a prospective energy function that depends on a network's generalized position and momentum. The resulting model can be interpreted as a biologically plausible approximation of error backpropagation in deep cortical networks with continuous-time, leaky neuronal dynamics and continuously active, local plasticity. We demonstrate successful learning of standard benchmark datasets, achieving competitive performance using both fully-connected and convolutional architectures, and show how our principle can be applied to detailed models of cortical microcircuitry. Furthermore, we study the robustness of our model to spatio-temporal substrate imperfections to demonstrate its feasibility for physical realization, be it in vivo or in silico. △ Less

Submitted 27 October, 2021; originally announced October 2021.

Comments: Accepted for publication in Advances in Neural Information Processing Systems 34 (NeurIPS 2021); 13 pages, 4 figures; 10 pages of supplementary material, 1 supplementary figure

ACM Class: F.1.1; I.2.6; I.5.1; B.8.1

arXiv:2102.08211 [pdf, other]

The Yin-Yang dataset

Authors: Laura Kriener, Julian Göltz, Mihai A. Petrovici

Abstract: The Yin-Yang dataset was developed for research on biologically plausible error backpropagation and deep learning in spiking neural networks. It serves as an alternative to classic deep learning datasets, especially in early-stage prototyping scenarios for both network models and hardware platforms, for which it provides several advantages. First, it is smaller and therefore faster to learn, there… ▽ More The Yin-Yang dataset was developed for research on biologically plausible error backpropagation and deep learning in spiking neural networks. It serves as an alternative to classic deep learning datasets, especially in early-stage prototyping scenarios for both network models and hardware platforms, for which it provides several advantages. First, it is smaller and therefore faster to learn, thereby being better suited for small-scale exploratory studies in both software simulations and hardware prototypes. Second, it exhibits a very clear gap between the accuracies achievable using shallow as compared to deep neural networks. Third, it is easily transferable between spatial and temporal input domains, making it interesting for different types of classification scenarios. △ Less

Submitted 4 January, 2022; v1 submitted 16 February, 2021; originally announced February 2021.

Comments: 6 pages, 4 figures, 2 tables

arXiv:1912.11443 [pdf, other]

doi 10.1038/s42256-021-00388-x

Fast and energy-efficient neuromorphic deep learning with first-spike times

Authors: Julian Göltz, Laura Kriener, Andreas Baumbach, Sebastian Billaudelle, Oliver Breitwieser, Benjamin Cramer, Dominik Dold, Akos Ferenc Kungl, Walter Senn, Johannes Schemmel, Karlheinz Meier, Mihai Alexandru Petrovici

Abstract: For a biological agent operating under environmental pressure, energy consumption and reaction times are of critical importance. Similarly, engineered systems are optimized for short time-to-solution and low energy-to-solution characteristics. At the level of neuronal implementation, this implies achieving the desired results with as few and as early spikes as possible. With time-to-first-spike co… ▽ More For a biological agent operating under environmental pressure, energy consumption and reaction times are of critical importance. Similarly, engineered systems are optimized for short time-to-solution and low energy-to-solution characteristics. At the level of neuronal implementation, this implies achieving the desired results with as few and as early spikes as possible. With time-to-first-spike coding both of these goals are inherently emerging features of learning. Here, we describe a rigorous derivation of a learning rule for such first-spike times in networks of leaky integrate-and-fire neurons, relying solely on input and output spike times, and show how this mechanism can implement error backpropagation in hierarchical spiking networks. Furthermore, we emulate our framework on the BrainScaleS-2 neuromorphic system and demonstrate its capability of harnessing the system's speed and energy characteristics. Finally, we examine how our approach generalizes to other neuromorphic platforms by studying how its performance is affected by typical distortive effects induced by neuromorphic substrates. △ Less

Submitted 17 May, 2021; v1 submitted 24 December, 2019; originally announced December 2019.

Comments: 24 pages, 11 figures

Journal ref: Nature Machine Intelligence 3, 823-835 (2021)

arXiv:1804.01840 [pdf, other]

doi 10.1109/TBCAS.2018.2848203

A Mixed-Signal Structured AdEx Neuron for Accelerated Neuromorphic Cores

Authors: Syed Ahmed Aamir, Paul Müller, Gerd Kiene, Laura Kriener, Yannik Stradmann, Andreas Grübl, Johannes Schemmel, Karlheinz Meier

Abstract: Here we describe a multi-compartment neuron circuit based on the Adaptive-Exponential I&F (AdEx) model, developed for the second-generation BrainScaleS hardware. Based on an existing modular Leaky Integrate-and-Fire (LIF) architecture designed in 65 nm CMOS, the circuit features exponential spike generation, neuronal adaptation, inter-compartmental connections as well as a conductance-based reset.… ▽ More Here we describe a multi-compartment neuron circuit based on the Adaptive-Exponential I&F (AdEx) model, developed for the second-generation BrainScaleS hardware. Based on an existing modular Leaky Integrate-and-Fire (LIF) architecture designed in 65 nm CMOS, the circuit features exponential spike generation, neuronal adaptation, inter-compartmental connections as well as a conductance-based reset. The design reproduces a diverse set of firing patterns observed in cortical pyramidal neurons. Further, it enables the emulation of sodium and calcium spikes, as well as N-Methyl-D-Aspartate (NMDA) plateau potentials known from apical and thin dendrites. We characterize the AdEx circuit extensions and exemplify how the interplay between passive and non-linear active signal processing enhances the computational capabilities of single (but structured) on-chip neurons. △ Less

Submitted 29 May, 2018; v1 submitted 5 April, 2018; originally announced April 2018.

Comments: 11 pages, 17 figures (including author photographs)

Showing 1–8 of 8 results for author: Kriener, L