Inverting Transformer-based Vision Models

Authors: Jan Rathjens, Shirin Reyhanian, David Kappel, Laurenz Wiskott

Abstract: Understanding the mechanisms underlying deep neural networks in computer vision remains a fundamental challenge. While many previous approaches have focused on visualizing intermediate representations within deep neural networks, particularly convolutional neural networks, these techniques have yet to be thoroughly explored in transformer-based vision models. In this study, we apply a modular approach of training inverse models to reconstruct input images from intermediate layers within a Detection Transformer and a Vision Transformer, showing that this approach is efficient and feasible. Through qualitative and quantitative evaluations of reconstructed images, we generate insights into the underlying mechanisms of these architectures, highlighting their similarities and differences in terms of contextual shape and preservation of image details, inter-layer correlation, and robustness to color perturbations. Our analysis illustrates how these properties emerge within the models, contributing to a deeper understanding of transformer-based vision models. The code for reproducing our experiments is available at github.com/wiskott-lab/inverse-tvm.

Submitted 25 March, 2025; v1 submitted 9 December, 2024; originally announced December 2024.

arXiv:2411.12603 [pdf, other]

STREAM: A Universal State-Space Model for Sparse Geometric Data

Authors: Mark Schöne, Yash Bhisikar, Karan Bania, Khaleelulla Khan Nazeer, Christian Mayr, Anand Subramoney, David Kappel

Abstract: Handling sparse and unstructured geometric data, such as point clouds or event-based vision, is a pressing challenge in the field of machine vision. Recently, sequence models such as Transformers and state-space models entered the domain of geometric data. These methods require specialized preprocessing to create a sequential view of a set of points. Furthermore, prior works involving sequence models iterate geometric data with either uniform or learned step sizes, implicitly relying on the model to infer the underlying geometric structure. In this work, we propose to encode geometric structure explicitly into the parameterization of a state-space model. State-space models are based on linear dynamics governed by a one-dimensional variable such as time or a spatial coordinate. We exploit this dynamic variable to inject relative differences of coordinates into the step size of the state-space model. The resulting geometric operation computes interactions between all pairs of N points in O(N) steps. Our model deploys the Mamba selective state-space model with a modified CUDA kernel to efficiently map sparse geometric data to modern hardware. The resulting sequence model, which we call STREAM, achieves competitive results on a range of benchmarks from point-cloud classification to event-based vision and audio classification. STREAM demonstrates a powerful inductive bias for sparse geometric data by improving the PointMamba baseline when trained from scratch on the ModelNet40 and ScanObjectNN point cloud analysis datasets. It further achieves, for the first time, 100% test accuracy on all 11 classes of the DVS128 Gestures dataset.

Submitted 22 November, 2024; v1 submitted 19 November, 2024; originally announced November 2024.
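
The STREAM abstract above describes injecting relative coordinate differences into the step size of a state-space model so that all pairwise interactions are covered in O(N) steps. The following is a minimal scalar sketch of that general idea, not the authors' Mamba-based implementation: the function names, the decay parameter `a`, the input weight `b`, and the simple exponential discretization are all assumptions made for illustration. It checks that a single linear scan reproduces an explicit sum over all pairs (j <= k).

```python
import math


def irregular_ssm_scan(times, inputs, a=-1.0, b=1.0):
    """Scalar linear recurrence whose decay depends on the gap between points.

    Hypothetical toy model: h_k = exp(a * delta_k) * h_{k-1} + b * x_k,
    where delta_k = t_k - t_{k-1} is the relative coordinate difference.
    """
    state = 0.0
    prev_t = times[0]
    states = []
    for t, x in zip(times, inputs):
        delta = t - prev_t  # coordinate difference used as the step size
        state = math.exp(a * delta) * state + b * x
        states.append(state)
        prev_t = t
    return states


def pairwise_reference(times, inputs, a=-1.0, b=1.0):
    """Explicit O(N^2) sum over all pairs (j <= k), used only for comparison."""
    return [
        sum(math.exp(a * (times[k] - times[j])) * b * inputs[j] for j in range(k + 1))
        for k in range(len(times))
    ]


# Irregularly spaced sample points (illustrative values only).
times = [0.0, 0.1, 0.7, 0.75, 2.0]
inputs = [1.0, -0.5, 2.0, 0.3, 1.2]

scan = irregular_ssm_scan(times, inputs)
ref = pairwise_reference(times, inputs)
max_error = max(abs(s - r) for s, r in zip(scan, ref))
```

Because the per-step decays multiply along the sequence, the state at point k already contains every earlier point weighted by exp(a * (t_k - t_j)), which is why the linear scan and the quadratic reference agree.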

arXiv:2410.11687 [pdf, other]

State-space models can learn in-context by gradient descent

Authors: Neeraj Mohan Sushma, Yudou Tian, Harshvardhan Mestha, Nicolo Colombo, David Kappel, Anand Subramoney

Abstract: Deep state-space models (Deep SSMs) are becoming popular as effective approaches to model sequence data. They have also been shown to be capable of in-context learning, much like transformers. However, a complete picture of how SSMs might be able to do in-context learning has been missing. In this study, we provide a direct and explicit construction to show that state-space models can perform gradient-based learning and use it for in-context learning in much the same way as transformers. Specifically, we prove that a single structured state-space model layer, augmented with multiplicative input and output gating, can reproduce the outputs of an implicit linear model with least squares loss after one step of gradient descent. We then show a straightforward extension to multi-step linear and non-linear regression tasks. We validate our construction by training randomly initialized augmented SSMs on linear and non-linear regression tasks. The parameters obtained empirically through optimization match the ones predicted analytically by the theoretical construction. Overall, we elucidate the role of input- and output-gating in recurrent architectures as the key inductive biases for enabling the expressive power typical of foundation models. We also provide novel insights into the relationship between state-space models and linear self-attention, and their ability to learn in-context.

Submitted 18 February, 2025; v1 submitted 15 October, 2024; originally announced October 2024.

Comments: 20 pages, 6 figures
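
The abstract of this entry states that a gated state-space layer can reproduce the output of a linear least-squares model after one step of gradient descent. The sketch below illustrates why such an equivalence is plausible in the simplest case. It is not the paper's construction: it assumes zero-initialized weights, the loss 1/2 * sum_i (w . x_i - y_i)^2, an identity state transition, and hypothetical function names. Under those assumptions, the one-step prediction for a query equals lr * sum_i y_i (x_i . x_query), which a recurrence can accumulate with a multiplicative input (y_i times x_i) and a multiplicative readout (state times query).

```python
def gd_one_step_prediction(xs, ys, x_query, lr):
    """Prediction of a linear model after one gradient step from w = 0."""
    dim = len(x_query)
    w = [0.0] * dim
    grad = [0.0] * dim
    for x, y in zip(xs, ys):
        err = sum(wi * xi for wi, xi in zip(w, x)) - y  # residual at w = 0
        for d in range(dim):
            grad[d] += err * x[d]
    w = [wi - lr * g for wi, g in zip(w, grad)]
    return sum(wi * qi for wi, qi in zip(w, x_query))


def gated_recurrence_prediction(xs, ys, x_query, lr):
    """Same quantity computed by a linear recurrence with multiplicative gating."""
    state = [0.0] * len(x_query)
    for x, y in zip(xs, ys):
        # Input gating (assumed form): the target multiplies the input.
        state = [s + y * xi for s, xi in zip(state, x)]
    # Output gating (assumed form): the state is multiplied with the query.
    return lr * sum(s * qi for s, qi in zip(state, x_query))


# Illustrative in-context examples and query.
xs = [[1.0, 2.0], [0.5, -1.0], [-1.5, 0.3]]
ys = [1.0, -2.0, 0.5]
x_query = [0.2, 0.4]
lr = 0.1

pred_gd = gd_one_step_prediction(xs, ys, x_query, lr)
pred_rec = gated_recurrence_prediction(xs, ys, x_query, lr)
```

Both functions return the same value for any inputs, since the gradient at w = 0 is simply the negative sum of y_i * x_i.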

arXiv:2410.05985 [pdf, other]

Asynchronous Stochastic Gradient Descent with Decoupled Backpropagation and Layer-Wise Updates

Authors: Cabrel Teguemne Fokam, Khaleelulla Khan Nazeer, Lukas König, David Kappel, Anand Subramoney

Abstract: The increasing size of deep learning models has made distributed training across multiple devices essential. However, current methods such as distributed data-parallel training suffer from large communication and synchronization overheads when training across devices, leading to longer training times as a result of suboptimal hardware utilization. Asynchronous stochastic gradient descent (ASGD) methods can improve training speed, but are sensitive to delays due to both communication and differences in throughput. Moreover, the backpropagation algorithm used within ASGD workers is bottlenecked by the interlocking between its forward and backward passes. Current methods also do not take advantage of the large differences in the computation required for the forward and backward passes. Therefore, we propose an extension to ASGD called Partial Decoupled ASGD (PD-ASGD) that addresses these issues. PD-ASGD uses separate threads for the forward and backward passes, decoupling the updates and allowing for a higher ratio of forward to backward threads than the usual 1:1 ratio, leading to higher throughput. PD-ASGD also performs layer-wise (partial) model updates concurrently across multiple threads. This reduces parameter staleness and consequently improves robustness to delays. Our approach yields close to state-of-the-art results while running up to $5.95\times$ faster than synchronous data parallelism in the presence of delays, and up to $2.14\times$ faster than comparable ASGD algorithms by achieving higher model flops utilization. We mathematically describe the gradient bias introduced by our method, establish an upper bound, and prove convergence.

Submitted 7 February, 2025; v1 submitted 8 October, 2024; originally announced October 2024.

Comments: 17 pages, 5 figures

MSC Class: G.1.6 ACM Class: I.2.6; I.5.1

arXiv:2405.00433 [pdf, other]

Weight Sparsity Complements Activity Sparsity in Neuromorphic Language Models

Authors: Rishav Mukherji, Mark Schöne, Khaleelulla Khan Nazeer, Christian Mayr, David Kappel, Anand Subramoney

Abstract: Activity and parameter sparsity are two standard methods of making neural networks computationally more efficient. Event-based architectures such as spiking neural networks (SNNs) naturally exhibit activity sparsity, and many methods exist to sparsify their connectivity by pruning weights. While the effect of weight pruning on feed-forward SNNs has been previously studied for computer vision tasks, the effects of pruning for complex sequence tasks like language modeling are less well studied since SNNs have traditionally struggled to achieve meaningful performance on these tasks. Using a recently published SNN-like architecture that works well on small-scale language modeling, we study the effects of weight pruning when combined with activity sparsity. Specifically, we study the trade-off between the multiplicative efficiency gains the combination affords and its effect on task performance for language modeling. To dissect the effects of the two sparsities, we conduct a comparative analysis between densely activated models and sparsely activated event-based models across varying degrees of connectivity sparsity. We demonstrate that sparse activity and sparse connectivity complement each other without a proportional drop in task performance for an event-based neural network trained on the Penn Treebank and WikiText-2 language modeling datasets. Our results suggest sparsely connected event-based neural networks are promising candidates for effective and efficient sequence modeling.

Submitted 1 May, 2024; originally announced May 2024.

Comments: arXiv admin note: text overlap with arXiv:2311.07625
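
The "multiplicative efficiency gains" mentioned in this abstract can be made concrete with a short worked estimate. The notation and numbers below are illustrative assumptions, not results from the paper: let $s_w$ be the fraction of pruned weights and $s_a$ the fraction of units that are silent at a given step, and assume the two are statistically independent.

```latex
% Expected fraction of multiply-accumulate operations (MACs) that remain,
% relative to a dense layer, under the independence assumption:
\[
  \frac{\text{MACs}_{\text{sparse}}}{\text{MACs}_{\text{dense}}}
  \approx (1 - s_w)\,(1 - s_a)
\]
% Hypothetical example: s_w = 0.8 and s_a = 0.9 give
\[
  (1 - 0.8)\,(1 - 0.9) = 0.2 \times 0.1 = 0.02 ,
\]
% i.e. roughly a 50-fold reduction, versus 5-fold or 10-fold from either
% form of sparsity on its own.
```

The estimate only counts operations. Whether it translates into wall-clock or energy savings depends on hardware support for both kinds of sparsity.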

arXiv:2404.18508 [pdf, other]

Scalable Event-by-event Processing of Neuromorphic Sensory Signals With Deep State-Space Models

Authors: Mark Schöne, Neeraj Mohan Sushma, Jingyue Zhuge, Christian Mayr, Anand Subramoney, David Kappel

Abstract: Event-based sensors are well suited for real-time processing due to their fast response times and encoding of the sensory data as successive temporal differences. These and other valuable properties, such as a high dynamic range, are suppressed when the data is converted to a frame-based format. However, most current methods either collapse events into frames or cannot scale up when processing the event data directly event-by-event. In this work, we address the key challenges of scaling up event-by-event modeling of the long event streams emitted by such sensors, which is a particularly relevant problem for neuromorphic computing. While prior methods can process up to a few thousand time steps, our model, based on modern recurrent deep state-space models, scales to event streams of millions of events for both training and inference. We leverage their stable parameterization for learning long-range dependencies, parallelizability along the sequence dimension, and their ability to integrate asynchronous events effectively to scale them up to long event streams. We further augment these with novel event-centric techniques enabling our model to match or beat the state-of-the-art performance on several event stream benchmarks. In the Spiking Speech Commands task, we improve state-of-the-art by a large margin of 7.7% to 88.4%. On the DVS128-Gestures dataset, we achieve competitive results without using frames or convolutional neural networks. Our work demonstrates, for the first time, that it is possible to use fully event-based processing with purely recurrent networks to achieve state-of-the-art task performance in several event-based benchmarks.

Submitted 9 October, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

arXiv:2402.02521 [pdf, other]

Neuromorphic hardware for sustainable AI data centers

Authors: Bernhard Vogginger, Amirhossein Rostami, Vaibhav Jain, Sirine Arfa, Andreas Hantsch, David Kappel, Michael Schäfer, Ulrike Faltings, Hector A. Gonzalez, Chen Liu, Christian Mayr, Wolfgang Maaß

Abstract: As humans advance toward a higher level of artificial intelligence, it is always at the cost of escalating computational resource consumption, which requires developing novel solutions to meet the exponential growth of AI computing demand. Neuromorphic hardware takes inspiration from how the brain processes information and promises energy-efficient computing of AI workloads. Despite its potential, neuromorphic hardware has not found its way into commercial AI data centers. In this article, we try to analyze the underlying reasons for this and derive requirements and guidelines to promote neuromorphic systems for efficient and sustainable cloud computing. We first review currently available neuromorphic hardware systems and collect examples where neuromorphic solutions outperform conventional AI processing on CPUs and GPUs. Next, we identify applications, models and algorithms which are commonly deployed in AI data centers as further directions for neuromorphic algorithms research. Last, we derive requirements and best practices for the hardware and software integration of neuromorphic systems into data centers. With this article, we hope to increase awareness of the challenges of integrating neuromorphic hardware into data centers and to guide the community to enable sustainable and energy-efficient AI at scale.

Submitted 26 June, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

Comments: 11 pages, 2 figures, presented as poster at NICE 2024, 2nd version with updated author list and minor updates

arXiv:2312.09084 [pdf, other]

Language Modeling on a SpiNNaker 2 Neuromorphic Chip

Authors: Khaleelulla Khan Nazeer, Mark Schöne, Rishav Mukherji, Bernhard Vogginger, Christian Mayr, David Kappel, Anand Subramoney

Abstract: As large language models continue to scale in size rapidly, so too does the computational power required to run them. Event-based networks on neuromorphic devices offer a potential way to reduce energy consumption for inference significantly. However, to date, most event-based networks that can run on neuromorphic hardware, including spiking neural networks (SNNs), have not achieved task performance even on par with LSTM models for language modeling. As a result, language modeling on neuromorphic devices has seemed a distant prospect. In this work, we demonstrate the first-ever implementation of a language model on a neuromorphic device - specifically the SpiNNaker 2 chip - based on a recently published event-based architecture called the EGRU. SpiNNaker 2 is a many-core neuromorphic chip designed for large-scale asynchronous processing, while the EGRU is architected to leverage such hardware efficiently while maintaining competitive task performance. This implementation marks the first time a neuromorphic language model matches LSTMs, setting the stage for taking task performance to the level of large language models. We also demonstrate results on a gesture recognition task based on inputs from a DVS camera. Overall, our results showcase the feasibility of this neuro-inspired neural network in hardware, highlighting significant gains versus conventional hardware in energy efficiency for the common use case of single batch inference.

Submitted 24 January, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

arXiv:2311.11712 [pdf]

doi 10.1007/978-3-031-29003-9_31

Identification of Ammonium Salts on Comet 67P/C-G Surface from Infrared VIRTIS/Rosetta Data Based on Laboratory Experiments. Implications and Perspectives

Authors: Olivier Poch, Istiqomah Istiqomah, Eric Quirico, Pierre Beck, Bernard Schmitt, Patrice Theulé, Alexandre Faure, Pierre Hily-Blant, Lydie Bonal, Andrea Raponi, Mauro Ciarniello, Batiste Rousseau, Sandra Potin, Olivier Brissaud, Laurène Flandinet, Gianrico Filacchione, Antoine Pommerol, Nicolas Thomas, David Kappel, Vito Mennella, Lyuba Moroz, Vassilissa Vinogradoff, Gabriele Arnold, Stéphane Erard, Dominique Bockelée-Morvan, et al. (7 additional authors not shown)

Abstract: The nucleus of comet 67P/Churyumov-Gerasimenko exhibits a broad spectral reflectance feature around 3.2 μm, which is omnipresent in all spectra of the surface, and whose attribution has remained elusive since its discovery. Based on laboratory experiments, we have shown that most of this absorption feature is due to ammonium (NH4+) salts mixed with the dark surface material. The depth of the band is compatible with semi-volatile ammonium salts being a major reservoir of nitrogen in the comet, which could dominate over refractory organic matter and volatile species. These salts may thus represent the long-sought reservoir of nitrogen in comets, possibly bringing their nitrogen-to-carbon ratio in agreement with the solar value. Moreover, the reflectance spectra of several asteroids are compatible with the presence of NH4+ salts at their surfaces. The presence of such salts, and other NH4+-bearing compounds on asteroids, comets, and possibly in proto-stellar environments, suggests that NH4+ may be a tracer of the incorporation and transformation of nitrogen in ices, minerals and organics, at different phases of the formation of the Solar System.

Submitted 20 November, 2023; originally announced November 2023.

Journal ref: European Conference on Laboratory Astrophysics ECLA2020, Sep 2021, Anacapri, Italy. pp.271 - 279

arXiv:2306.14691 [pdf, other]

doi 10.1088/2634-4386/ad01d6

Tunable Synaptic Working Memory with Volatile Memristive Devices

Authors: Saverio Ricci, David Kappel, Christian Tetzlaff, Daniele Ielmini, Erika Covi

Abstract: Different real-world cognitive tasks evolve on different relevant timescales. Processing these tasks requires memory mechanisms able to match their specific time constants. In particular, the working memory utilizes mechanisms that span orders of magnitude of timescales, from milliseconds to seconds or even minutes. This plentitude of timescales is an essential ingredient of working memory tasks like visual or language processing. This degree of flexibility is challenging in analog computing hardware because it requires the integration of several reconfigurable capacitors of different sizes. Emerging volatile memristive devices present a compact and appealing solution to reproduce reconfigurable temporal dynamics in a neuromorphic network. We present a demonstration of working memory using a silver-based memristive device whose key parameters, retention time and switching probability, can be electrically tuned and adapted to the task at hand. First, we demonstrate the principles of working memory in small-scale hardware to execute an associative memory task. Then, we use the experimental data in two larger-scale simulations, the first featuring working memory in a biological environment, the second demonstrating associative symbolic working memory.

Submitted 26 June, 2023; originally announced June 2023.

Journal ref: Neuromorphic Computing and Engineering 2023

arXiv:2305.14974 [pdf, other]

Block-local learning with probabilistic latent representations

Authors: David Kappel, Khaleelulla Khan Nazeer, Cabrel Teguemne Fokam, Christian Mayr, Anand Subramoney

Abstract: The ubiquitous backpropagation algorithm requires sequential updates through the network, introducing a locking problem. In addition, backpropagation relies on the transpose of forward weight matrices to compute updates, introducing a weight transport problem across the network. Locking and weight transport are problems because they prevent efficient parallelization and horizontal scaling of the training process. We propose a new method to address both these problems and scale up the training of large models. Our method works by dividing a deep neural network into blocks and introduces a feedback network that propagates the information from the targets backwards to provide auxiliary local losses. Forward and backward propagation can operate in parallel and with different sets of weights, addressing the problems of locking and weight transport. Our approach derives from a statistical interpretation of training that treats output activations of network blocks as parameters of probability distributions. The resulting learning framework uses these parameters to evaluate the agreement between forward and backward information. Error backpropagation is then performed locally within each block, leading to "block-local" learning. Several previously proposed alternatives to error backpropagation emerge as special cases of our model. We present results on a variety of tasks and architectures, demonstrating state-of-the-art performance using block-local learning. These results provide a new principled framework for training networks in a distributed setting.

Submitted 27 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: 23 pages, 4 figures, preprint

arXiv:2211.03081 [pdf]

doi 10.1109/ICECS202256217.2022.9971100

Decision Making by a Neuromorphic Network of Volatile Resistive Switching Memories

Authors: Saverio Ricci, David Kappel, Christian Tetzlaff, Daniele Ielmini, Erika Covi

Abstract: The necessity of having an electronic device working in relevant biological time scales with a small footprint boosted the research of a new class of emerging memories. Ag-based volatile resistive switching memories (RRAMs) feature a spontaneous change of device conductance with a similarity to biological mechanisms. They rely on the formation and self-disruption of a metallic conductive filament through an oxide layer, with a retention time ranging from a few milliseconds to several seconds, greatly tunable according to the maximum current which is flowing through the device. Here we prove a neuromorphic system based on volatile-RRAMs able to mimic the principles of biological decision-making behavior and tackle the Two-Alternative Forced Choice problem, where a subject is asked to make a choice between two possible alternatives not relying on a precise knowledge of the problem, rather on noisy perceptions.

Submitted 6 November, 2022; originally announced November 2022.

Journal ref: 2022 29th IEEE International Conference on Electronics, Circuits and Systems (ICECS)

arXiv:2206.06178 [pdf, other]

Efficient recurrent architectures through activity sparsity and sparse back-propagation through time

Authors: Anand Subramoney, Khaleelulla Khan Nazeer, Mark Schöne, Christian Mayr, David Kappel

Abstract: Recurrent neural networks (RNNs) are well suited for solving sequence tasks in resource-constrained systems due to their expressivity and low computational requirements. However, there is still a need to bridge the gap between what RNNs are capable of in terms of efficiency and performance and real-world application requirements. The memory and computational requirements arising from propagating the activations of all the neurons at every time step to every connected neuron, together with the sequential dependence of activations, contribute to the inefficiency of training and using RNNs. We propose a solution inspired by biological neuron dynamics that makes the communication between RNN units sparse and discrete. This makes the backward pass with backpropagation through time (BPTT) computationally sparse and efficient as well. We base our model on the gated recurrent unit (GRU), extending it with units that emit discrete events for communication triggered by a threshold so that no information is communicated to other units in the absence of events. We show theoretically that the communication between units, and hence the computation required for both the forward and backward passes, scales with the number of events in the network. Our model achieves efficiency without compromising task performance, demonstrating competitive performance compared to state-of-the-art recurrent network models in real-world tasks, including language modeling. The dynamic activity sparsity mechanism also makes our model well suited for novel energy-efficient neuromorphic hardware. Code is available at https://github.com/KhaleelKhan/EvNN/.

Submitted 9 March, 2023; v1 submitted 13 June, 2022; originally announced June 2022.

Comments: Published as notable-top-25% paper in ICLR 2023
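
The abstract of this entry describes units that communicate only through threshold-triggered events, so that computation scales with the number of events. The sketch below illustrates that scaling for a single matrix-vector product. It is a toy under stated assumptions and not the EvNN library or the EGRU update rule: the function names, the fixed threshold, and the choice to transmit the raw value on an event are all hypothetical, and the GRU gating and training procedure are omitted.

```python
def emit_events(values, threshold):
    """Return (index, value) pairs for units whose value exceeds the threshold."""
    return [(i, v) for i, v in enumerate(values) if v > threshold]


def event_driven_matvec(weights, events, n_out):
    """Accumulate only the weight columns of units that emitted an event."""
    out = [0.0] * n_out
    macs = 0  # multiply-accumulate operations actually performed
    for j, v in events:
        for i in range(n_out):
            out[i] += weights[i][j] * v
            macs += 1
    return out, macs


def dense_matvec(weights, vector):
    """Reference dense product, used only to check the event-driven result."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


# Illustrative unit values and weights (2 outputs, 4 inputs).
values = [0.1, 0.9, -0.3, 0.6]
weights = [[1.0, 2.0, 3.0, 4.0],
           [0.5, -1.0, 0.0, 2.0]]
threshold = 0.5

events = emit_events(values, threshold)
sparse_out, sparse_macs = event_driven_matvec(weights, events, n_out=2)

# Dense equivalent: units below threshold communicate nothing (zero).
gated = [v if v > threshold else 0.0 for v in values]
dense_out = dense_matvec(weights, gated)
dense_macs = len(weights) * len(values)
```

Here two of the four units cross the threshold, so the event-driven product performs half the multiply-accumulates of the dense one while producing the same output.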

arXiv:2103.12649 [pdf, other]

A synapse-centric account of the free energy principle

Authors: David Kappel, Christian Tetzlaff

Abstract: The free energy principle (FEP) is a mathematical framework that describes how biological systems self-organize and survive in their environment. This principle provides insights on multiple scales, from high-level behavioral and cognitive functions such as attention or foraging, down to the dynamics of specialized cortical microcircuits, suggesting that the FEP manifests on several levels of brain function. Here, we apply the FEP to one of the smallest functional units of the brain: single excitatory synaptic connections. By focusing on an experimentally well-understood biological system we are able to derive learning rules from first principles while keeping assumptions minimal. This synapse-centric account of the FEP predicts that synapses interact with the soma of the post-synaptic neuron through stochastic synaptic releases to probe their behavior and use back-propagating action potentials as feedback to update the synaptic weights. The emergent learning rules are regulated triplet STDP rules that depend only on the timing of the pre- and post-synaptic spikes and the internal states of the synapse. The parameters of the learning rules are fully determined by the parameters of the post-synaptic neuron model, suggesting a close interplay between the synaptic and somatic compartment and making precise predictions about the synaptic dynamics. The synapse-level uncertainties automatically lead to representations of uncertainty on the network level that manifest in ambiguous situations. We show that the FEP learning rules can be applied to spiking neural networks for supervised and unsupervised learning and for a closed loop learning task where a behaving agent interacts with an environment.

Submitted 23 March, 2021; originally announced March 2021.

arXiv:2012.14937 [pdf, other]

doi 10.3389/fnins.2021.611300

Adaptive Extreme Edge Computing for Wearable Devices

Authors: Erika Covi, Elisa Donati, Hadi Heidari, David Kappel, Xiangpeng Liang, Melika Payvand, Wei Wang

Abstract: Wearable devices are a fast-growing technology with impact on personal healthcare for both society and economy. Due to the widespread use of sensors in pervasive and distributed networks, power consumption, processing speed, and system adaptation are vital in future smart wearable devices. The visioning and forecasting of how to bring computation to the edge in smart sensors have already begun, with an aspiration to provide adaptive extreme edge computing. Here, we provide a holistic view of hardware and theoretical solutions towards smart wearable devices that can provide guidance to research in this pervasive computing era. We propose various solutions for biologically plausible models for continual learning in neuromorphic computing technologies for wearable sensors. To envision this concept, we provide a systematic outline in which prospective low power and low latency scenarios of wearable sensors in neuromorphic platforms are expected. We successively describe vital potential landscapes of neuromorphic processors exploiting complementary metal-oxide semiconductors (CMOS) and emerging memory technologies (e.g. memristive devices). Furthermore, we evaluate the requirements for edge computing within wearable devices in terms of footprint, power consumption, latency, and data size. We additionally investigate the challenges beyond neuromorphic computing hardware, algorithms and devices that could impede enhancement of adaptive edge computing in smart wearable devices.

Submitted 29 December, 2020; originally announced December 2020.

Comments: 29 pages, 4 figures

Journal ref: Frontiers in Neuroscience 2021

arXiv:2009.14476 [pdf]

doi 10.1038/s41550-019-0992-8

Infrared detection of aliphatic organics on a cometary nucleus

Authors: A. Raponi, M. Ciarniello, F. Capaccioni, V. Mennella, G. Filacchione, V. Vinogradoff, O. Poch, P. Beck, E. Quirico, M. C. De Sanctis, L. Moroz, D. Kappel, S. Erard, D. Bockelée-Morvan, A. Longobardo, F. Tosi, E. Palomba, J. -P. Combe, B. Rousseau, G. Arnold, R. W. Carlson, A. Pommerol, C. Pilorget, S. Fornasier, G. Bellucci , et al. (6 additional authors not shown)

Abstract: The ESA Rosetta mission has acquired unprecedented measurements of the nucleus surface of comet 67P/Churyumov-Gerasimenko (hereafter 67P), whose composition, as determined by in situ and remote sensing instruments including VIRTIS (Visible, InfraRed and Thermal Imaging Spectrometer), appears to be an assemblage of ices, minerals, and organic material. We performed a refined analysis of infrared observations of the nucleus of comet 67P carried out by the VIRTIS-M hyperspectral imager. We found that the overall shape of the 67P infrared spectrum is similar to that of other carbon-rich outer solar system objects, suggesting a possible genetic link with them. More importantly, we are also able to confirm the complex spectral structure of the wide 2.8-3.6 micron absorption feature populated by fainter bands. Among these, we unambiguously identified the presence of aliphatic organics by their ubiquitous 3.38, 3.42 and 3.47 micron bands. This novel infrared detection of aliphatic species on a cometary surface has strong implications for the evolutionary history of the primordial solar system and gives evidence that comets provide an evolutionary link between interstellar material and solar system bodies.

Submitted 30 September, 2020; originally announced September 2020.

arXiv:2005.02114 [pdf, other]

doi 10.3847/1538-4357/ab9084

Consistently Simulating a Wide Range of Atmospheric Scenarios for K2-18b with a Flexible Radiative Transfer Module

Authors: M. Scheucher, F. Wunderlich, J. L. Grenfell, M. Godolt, F. Schreier, D. Kappel, R. Haus, K. Herbst, H. Rauer

Abstract: The atmospheres of small, potentially rocky exoplanets are expected to cover a diverse range in composition and mass. Studying such objects therefore requires flexible and wide-ranging modeling capabilities. We present in this work the essential development steps that lead to our flexible radiative transfer module, REDFOX, and validate REDFOX for the Solar system planets Earth, Venus and Mars, as well as for steam atmospheres. REDFOX is a k-distribution model using the correlated-k approach with random overlap method for the calculation of opacities used in the $δ$-two-stream approximation for radiative transfer. Opacity contributions from Rayleigh scattering, UV / visible cross sections and continua can be added selectively. With the improved capabilities of our new model, we calculate various atmospheric scenarios for K2-18b, a super-Earth / sub-Neptune with $\sim$8 M$_\oplus$ orbiting in the temperate zone around an M-star, with recently observed H$_2$O spectral features in the infrared. We model Earth-like, Venus-like, as well as H$_2$-He primary atmospheres of different Solar metallicity and show resulting climates and spectral characteristics, compared to observed data. Our results suggest that K2-18b has an H$_2$-He atmosphere with limited amounts of H$_2$O and CH$_4$. Results do not support the possibility of K2-18b having a water reservoir directly exposed to the atmosphere, which would reduce atmospheric scale heights, and hence also the amplitudes of spectral features, inconsistent with the observations. We also performed tests for H$_2$-He atmospheres up to 50 times Solar metallicity, all compatible with the observations.

Submitted 5 May, 2020; originally announced May 2020.

Comments: 28 pages, 13 figures, accepted for publication in ApJ

arXiv:2003.06034 [pdf]

doi 10.1126/science.aaw7462

Ammonium salts are a reservoir of nitrogen on a cometary nucleus and possibly on some asteroids

Authors: O. Poch, I. Istiqomah, E. Quirico, P. Beck, B. Schmitt, P. Theulé, A. Faure, P. Hily-Blant, L. Bonal, A. Raponi, M. Ciarniello, B. Rousseau, S. Potin, O. Brissaud, L. Flandinet, G. Filacchione, A. Pommerol, N. Thomas, D. Kappel, V. Mennella, L. Moroz, V. Vinogradoff, G. Arnold, S. Erard, D. Bockelée-Morvan , et al. (7 additional authors not shown)

Abstract: The measured nitrogen-to-carbon ratio in comets is lower than for the Sun, a discrepancy which could be alleviated if there is an unknown reservoir of nitrogen in comets. The nucleus of comet 67P/Churyumov-Gerasimenko exhibits an unidentified broad spectral reflectance feature around 3.2 micrometers, which is ubiquitous across its surface. On the basis of laboratory experiments, we attribute this absorption band to ammonium salts mixed with dust on the surface. The depth of the band indicates that semivolatile ammonium salts are a substantial reservoir of nitrogen in the comet, potentially dominating over refractory organic matter and more volatile species. Similar absorption features appear in the spectra of some asteroids, implying a compositional link between asteroids, comets, and the parent interstellar cloud.

Submitted 12 March, 2020; originally announced March 2020.

Comments: Main manuscript and Supplementary material document, Accepted for publication in Science on February 14, 2020

Journal ref: Science 367, (2020) eaaw7462

arXiv:2003.01431 [pdf, other]

doi 10.3389/fnbot.2019.00081

Embodied Synaptic Plasticity with Online Reinforcement learning

Authors: Jacques Kaiser, Michael Hoff, Andreas Konle, J. Camilo Vasquez Tieck, David Kappel, Daniel Reichard, Anand Subramoney, Robert Legenstein, Arne Roennau, Wolfgang Maass, Rudiger Dillmann

Abstract: The endeavor to understand the brain involves multiple collaborating research fields. Classically, synaptic plasticity rules derived by theoretical neuroscientists are evaluated in isolation on pattern classification tasks. This contrasts with the biological brain, whose purpose is to control a body in closed-loop. This paper contributes to bringing the fields of computational neuroscience and robotics closer together by integrating open-source software components from these two fields. The resulting framework allows evaluating the validity of biologically plausible plasticity models in closed-loop robotics environments. We use this framework to evaluate Synaptic Plasticity with Online REinforcement learning (SPORE), a reward-learning rule based on synaptic sampling, on two visuomotor tasks: reaching and lane following. We show that SPORE is capable of learning to perform policies within the course of simulated hours for both tasks. Provisional parameter explorations indicate that the learning rate and the temperature driving the stochastic processes that govern synaptic learning dynamics need to be regulated for performance improvements to be retained. We conclude by discussing recent deep reinforcement learning techniques that would be beneficial for increasing the functionality of SPORE on visuomotor tasks.

Submitted 3 March, 2020; originally announced March 2020.

Comments: 18 pages, 5 figures, published in Frontiers in Neurorobotics

Journal ref: Frontiers in neurorobotics, volume 13, p81, 2019

arXiv:1912.12047 [pdf, other]

Structural plasticity on an accelerated analog neuromorphic hardware system

Authors: Sebastian Billaudelle, Benjamin Cramer, Mihai A. Petrovici, Korbinian Schreiber, David Kappel, Johannes Schemmel, Karlheinz Meier

Abstract: In computational neuroscience, as well as in machine learning, neuromorphic devices promise an accelerated and scalable alternative to neural network simulations. Their neural connectivity and synaptic capacity depend on their specific design choices, but are always intrinsically limited. Here, we present a strategy to achieve structural plasticity that optimizes resource allocation under these constraints by constantly rewiring the pre- and postsynaptic partners while keeping the neuronal fan-in constant and the connectome sparse. In particular, we implemented this algorithm on the analog neuromorphic system BrainScaleS-2. It was executed on a custom embedded digital processor located on chip, accompanying the mixed-signal substrate of spiking neurons and synapse circuits. We evaluated our implementation in a simple supervised learning scenario, showing its ability to optimize the network topology with respect to the nature of its training data, as well as its overall computational efficiency.

Submitted 30 September, 2020; v1 submitted 27 December, 2019; originally announced December 2019.

arXiv:1911.05990 [pdf, other]

Attention on Abstract Visual Reasoning

Authors: Lukas Hahne, Timo Lüddecke, Florentin Wörgötter, David Kappel

Abstract: Attention mechanisms have been boosting the performance of deep learning models on a wide range of applications, ranging from speech understanding to program induction. However, despite experiments from psychology which suggest that attention plays an essential role in visual reasoning, the full potential of attention mechanisms has so far not been explored to solve abstract cognitive tasks on image data. In this work, we propose a hybrid network architecture, grounded on self-attention and relational reasoning. We call this new model Attention Relation Network (ARNe). ARNe combines features from the recently introduced Transformer and the Wild Relation Network (WReN). We test ARNe on the Procedurally Generated Matrices (PGMs) datasets for abstract visual reasoning. ARNe outperforms the WReN model on this task by 11.28 ppt. Relational concepts between objects are learned efficiently, requiring only 35% of the training samples to surpass the reported accuracy of the baseline model. Our proposed hybrid model represents an alternative for learning abstract relations using self-attention and demonstrates that the Transformer network is also well suited for abstract visual reasoning.

Submitted 14 November, 2019; originally announced November 2019.

arXiv:1905.03022 [pdf, other]

doi 10.1051/0004-6361/201834869

Diurnal variation of dust and gas production in comet 67P/Churyumov-Gerasimenko at the inbound equinox as seen by OSIRIS and VIRTIS-M on board Rosetta

Authors: C. Tubiana, G. Rinaldi, C. Güttler, C. Snodgrass, X. Shi, X. Hu, R. Marschall, M. Fulle, D. Bockelée-Morvan, G. Naletto, F. Capaccioni, H. Sierks, G. Arnold, M. A. Barucci, J. -L. Bertaux, I. Bertini, D. Bodewits, M. T. Capria, M. Ciarniello, G. Cremonese, J. Crovisier, V. Da Deppo, S. Debei, M. De Cecco, J. Deller , et al. (31 additional authors not shown)

Abstract: On 27 Apr 2015, when 67P/C-G was at 1.76 au from the Sun and moving towards perihelion, the OSIRIS and VIRTIS-M instruments on Rosetta observed the evolving dust and gas coma during a complete rotation of the comet. We aim to characterize the dust, H2O and CO2 gas spatial distribution in the inner coma. To do this we performed a quantitative analysis of the release of dust and gas and compared the observed H2O production rate with the one calculated using a thermo-physical model. For this study we selected OSIRIS WAC images at 612 nm (dust) and VIRTIS-M image cubes at 612 nm, 2700 nm (H2O) and 4200 nm (CO2). We measured the average signal in a circular annulus, to study spatial variation around the comet, and in a sector of the annulus, to study temporal variation in the sunward direction with comet rotation, both at a fixed distance of 3.1 km from the comet centre. The spatial correlation between dust and water, both coming from the sun-lit side of the comet, shows that water is the main driver of dust activity in this time period. The spatial distribution of CO2 is not correlated with water and dust. There is no strong temporal correlation between the dust brightness and water production rate as the comet rotates. The dust brightness shows a peak at 0° sub-solar longitude, which is not pronounced in the water production. At the same epoch, there is also a maximum in CO2 production. An excess of measured water production, with respect to the value calculated using a simple thermo-physical model, is observed when the head lobe and regions of the Southern hemisphere with strong seasonal variations are illuminated. A drastic decrease in dust production, when the water production (both measured and from the model) displays a maximum, happens when typical Northern consolidated regions are illuminated and the Southern hemisphere regions with strong seasonal variations are instead in shadow.

Submitted 8 May, 2019; originally announced May 2019.

Comments: 15 pages, accepted for publication in A&A

arXiv:1903.08500 [pdf, other]

doi 10.1109/TBCAS.2019.2906401

Efficient Reward-Based Structural Plasticity on a SpiNNaker 2 Prototype

Authors: Yexin Yan, David Kappel, Felix Neumaerker, Johannes Partzsch, Bernhard Vogginger, Sebastian Hoeppner, Steve Furber, Wolfgang Maass, Robert Legenstein, Christian Mayr

Abstract: Advances in neuroscience uncover the mechanisms employed by the brain to efficiently solve complex learning tasks with very limited resources. However, the efficiency is often lost when one tries to port these findings to a silicon substrate, since brain-inspired algorithms often make extensive use of complex functions, such as random number generators, that are expensive to compute on standard general purpose hardware. The prototype chip of the 2nd generation SpiNNaker system is designed to overcome this problem. Low-power ARM processors equipped with a random number generator and an exponential function accelerator enable the efficient execution of brain-inspired algorithms. We implement the recently introduced reward-based synaptic sampling model that employs structural plasticity to learn a function or task. The numerical simulation of the model requires updating the synapse variables in each time step, including an explorative random term. To the best of our knowledge, this is the most complex synapse model implemented so far on the SpiNNaker system. By making efficient use of the hardware accelerators and numerical optimizations, the computation time of one plasticity update is reduced by a factor of 2. This, combined with fitting the model into the local SRAM, leads to 62% energy reduction compared to the case without accelerators and the use of external DRAM. The model implementation is integrated into the SpiNNaker software framework, allowing for scalability onto larger systems. The hardware-software system presented in this work paves the way for power-efficient mobile and biomedical applications with biologically plausible brain-inspired algorithms.

Submitted 20 March, 2019; originally announced March 2019.

Comments: accepted by IEEE TBioCAS

arXiv:1901.03074 [pdf, other]

doi 10.1051/0004-6361/201834764

VIRTIS-H observations of comet 67P's dust coma: spectral properties and color temperature variability with phase and elevation

Authors: D. Bockelée-Morvan, C. Leyrat, S. Erard, F. Andrieu, F. Capaccioni, G. Filacchione, P. H. Hasselmann, J. Crovisier, P. Drossart, G. Arnold, M. Ciarniello, D. Kappel, A. Longobardo, M. -T. Capria, M. C. De Sanctis, G. Rinaldi, F. Taylor

Abstract: We analyze 2-5 micrometre spectroscopic observations of the dust coma of comet 67P/Churyumov-Gerasimenko obtained with the VIRTIS-H instrument onboard Rosetta from 3 June to 29 October 2015 at heliocentric distances r_h = 1.24-1.55 AU. The 2-2.5 micrometre color, bolometric albedo, and color temperature are measured using spectral fitting. Data obtained at alpha = 90° solar phase angle show an increase of the bolometric albedo (0.05 to 0.14) with increasing altitude (0.5 to 8 km), accompanied by a possible marginal decrease of the color and color temperature. Possible explanations include the presence in the inner coma of dark particles on ballistic trajectories, and radial changes in particle composition. In the phase angle range 50-120°, phase reddening is significant (0.031 %/100 nm/°), for a mean color of 2 %/100 nm at alpha = 90°, that can be related to the roughness of the dust particles. Moreover, a decrease of the color temperature with decreasing phase angle is also observed at a rate of ~ 0.3 K/°, consistent with the presence of large porous particles, with low thermal inertia, and showing a significant day-to-night temperature contrast. Comparing data acquired at fixed phase angle (alpha = 90°), a 20% increase of the bolometric albedo is observed near perihelion. Heliocentric variations of the dust color are not significant in the analyzed time period. Measured color temperatures vary from 260 to 320 K, and follow a r^0.6 variation in the r_h = 1.24-1.5 AU range, close to the expected r_h^0.5 value.

Submitted 10 January, 2019; originally announced January 2019.

Comments: 13 pages, 11 figures. Accepted for publication in Astronomy & Astrophysics

Journal ref: A&A 630, A22 (2019)

arXiv:1711.09746 [pdf]

doi 10.1016/j.icarus.2017.10.015

Laboratory simulations of the Vis-NIR spectra of comet 67P using sub-μm sized cosmochemical analogues

Authors: Batiste Rousseau, Stéphane Érard, Pierre Beck, Éric Quirico, Bernard Schmitt, Olivier Brissaud, German Montes-Hernandez, Fabrizio Capaccioni, Gianrico Filacchione, Dominique Bockelée-Morvan, Cédric Leyrat, Mauro Ciarniello, Andrea Raponi, David Kappel, Gabriele Arnold, Ljuba V Moroz, Ernesto Palomba, Federico Tosi

Abstract: Laboratory spectral measurements of relevant analogue materials were performed in the framework of the Rosetta mission in order to explain the surface spectral properties of comet 67P. Fine powders of coal, iron sulphides, silicates and their mixtures were prepared and their spectra measured in the Vis-IR range. These spectra are compared to a reference spectrum of 67P nucleus obtained with the VIRTIS/Rosetta instrument up to 2.7 μm, excluding the organics band centred at 3.2 μm. The species used are known to be chemical analogues for cometary materials which could be present at the surface of 67P. Grain sizes of the powders range from tens of nanometres to hundreds of micrometres. Some of the mixtures studied here actually reach the very low reflectance level observed by VIRTIS on 67P. The best match is provided by a mixture of sub-micron coal, pyrrhotite, and silicates. Grain sizes are in agreement with the sizes of the dust particles detected by the GIADA, MIDAS and COSIMA instruments on board Rosetta. The coal used in the experiment is responsible for the spectral slope in the visible and infrared ranges. Pyrrhotite, which is strongly absorbing, is responsible for the low albedo observed in the NIR. The darkest components dominate the spectra, especially within intimate mixtures. Depending on sample preparation, pyrrhotite can coat the coal and silicate aggregates. Such coating effects can affect the spectra as much as particle size. In contrast, silicates seem to play a minor role.

Submitted 27 November, 2017; originally announced November 2017.

Comments: 12 pages, 11 figures, 2 tables

Journal ref: Icarus 2018 306:318

arXiv:1711.05136 [pdf, ps, other]

Deep Rewiring: Training very sparse deep networks

Authors: Guillaume Bellec, David Kappel, Wolfgang Maass, Robert Legenstein

Abstract: Neuromorphic hardware tends to pose limits on the connectivity of deep networks that one can run on them. But generic hardware and software implementations of deep learning also run more efficiently for sparse networks. Several methods exist for pruning connections of a neural network after it was trained without connectivity constraints. We present an algorithm, DEEP R, that enables us to directly train a sparsely connected neural network. DEEP R automatically rewires the network during supervised training so that connections are placed where they are most needed for the task, while their total number remains strictly bounded at all times. We demonstrate that DEEP R can be used to train very sparse feedforward and recurrent neural networks on standard benchmark tasks with just a minor loss in performance. DEEP R is based on a rigorous theoretical foundation that views rewiring as stochastic sampling of network configurations from a posterior.

Submitted 7 August, 2018; v1 submitted 14 November, 2017; originally announced November 2017.

Comments: Accepted for publication at ICLR 2018. 10 pages (12 with references, 24 with appendix), 4 Figures in the main text. Reviews are available at: https://openreview.net/forum?id=BJ_wN01C- . This recent version contains minor corrections in the appendix

arXiv:1704.04238 [pdf, other]

A dynamic connectome supports the emergence of stable computational function of neural circuits through reward-based learning

Authors: David Kappel, Robert Legenstein, Stefan Habenschuss, Michael Hsieh, Wolfgang Maass

Abstract: Synaptic connections between neurons in the brain are dynamic because of continuously ongoing spine dynamics, axonal sprouting, and other processes. In fact, it was recently shown that the spontaneous synapse-autonomous component of spine dynamics is at least as large as the component that depends on the history of pre- and postsynaptic neural activity. These data are inconsistent with common models for network plasticity, and raise the questions of how neural circuits can maintain a stable computational function in spite of these continuously ongoing processes, and what functional uses these ongoing processes might have. Here, we present a rigorous theoretical framework for these seemingly stochastic spine dynamics and rewiring processes in the context of reward-based learning tasks. We show that spontaneous synapse-autonomous processes, in combination with reward signals such as dopamine, can explain the capability of networks of neurons in the brain to configure themselves for specific computational tasks, and to compensate automatically for later changes in the network or task. Furthermore, we show theoretically and through computer simulations that stable computational performance is compatible with continuously ongoing synapse-autonomous changes. After reaching good computational performance it causes primarily a slow drift of network architecture and dynamics in task-irrelevant dimensions, as observed for neural activity in motor cortex and other areas. On the more abstract level of reinforcement learning the resulting model gives rise to an understanding of reward-driven network plasticity as continuous sampling of network configurations.

Submitted 5 January, 2018; v1 submitted 13 April, 2017; originally announced April 2017.

arXiv:1612.02231 [pdf]

doi 10.1093/mnras/stw3281

The temporal evolution of exposed water ice-rich areas on the surface of 67P/Churyumov-Gerasimenko: spectral analysis

Authors: A. Raponi, M. Ciarniello, F. Capaccioni, G. Filacchione, F. Tosi, M. C. De Sanctis, M. T. Capria, M. A. Barucci, A. Longobardo, E. Palomba, D. Kappel, G. Arnold, S. Mottola, B. Rousseau, E. Quirico, G. Rinaldi, S. Erard, D. Bockelee-Morvan, C. Leyrat

Abstract: Water ice-rich patches have been detected on the surface of comet 67P/Churyumov-Gerasimenko by the VIRTIS hyperspectral imager on-board the Rosetta spacecraft, since the orbital insertion in late August 2014. Among those, three icy patches have been selected, and VIRTIS data are used to analyse their properties and their temporal evolution while the comet was moving towards the Sun. We performed an extensive analysis of the spectral parameters, and we applied the Hapke radiative transfer model to retrieve the abundance and grain size of water ice, as well as the mixing modalities of water ice and dark terrains on the three selected water ice rich areas. Study of the spatial distribution of the spectral parameters within the ice-rich patches has revealed that water ice follows different patterns associated to a bimodal distribution of the grains: ~50 μm sized and ~2000 μm sized. In all three cases, after the first detections at about 3.5 AU heliocentric distance, the spatial extension and intensity of the water ice spectral features increased, reached a maximum after 60-100 days at about 3.0 AU, and then underwent an approximately equally timed decrease and disappearance at ~2.2 AU, before perihelion. The behaviour of the analysed patches can be assimilated to a seasonal cycle. In addition we found evidence of short-term variability associated to a diurnal water cycle. The similar lifecycle of the three icy regions indicates that water ice is uniformly distributed in the subsurface layers, and no large water ice reservoirs are present.

Submitted 7 December, 2016; originally announced December 2016.

Comments: submitted to MNRAS

arXiv:1606.00157 [pdf, other]

CaMKII activation supports reward-based neural network optimization through Hamiltonian sampling

Authors: Zhaofei Yu, David Kappel, Robert Legenstein, Sen Song, Feng Chen, Wolfgang Maass

Abstract: Synaptic plasticity is implemented and controlled through over a thousand different types of molecules in the postsynaptic density and presynaptic boutons that assume a staggering array of different states through phosphorylation and other mechanisms. One of the most prominent molecules in the postsynaptic density is CaMKII, which is described in molecular biology as a "memory molecule" that can integrate Ca-influx signals through auto-phosphorylation on a relatively large time scale of dozens of seconds. The functional impact of this memory mechanism is largely unknown. We show that the experimental data on the specific role of CaMKII activation in dopamine-gated spine consolidation suggest a general functional role in speeding up reward-guided search for network configurations that maximize reward expectation. Our theoretical analysis shows that stochastic search could in principle even attain optimal network configurations by emulating one of the most well-known nonlinear optimization methods, simulated annealing. But this optimization is usually impeded by slowness of stochastic search at a given temperature. We propose that CaMKII contributes a momentum term that substantially speeds up this search. In particular, it allows the network to overcome saddle points of the fitness function. The resulting improved stochastic policy search can be understood on a more abstract level as Hamiltonian sampling, which is known to be one of the most efficient stochastic search methods.

Submitted 15 May, 2018; v1 submitted 1 June, 2016; originally announced June 2016.

Comments: 27 pages, 5 figures

arXiv:1602.09098 [pdf, other]

doi 10.1016/j.icarus.2016.02.055

The global surface composition of 67P/CG nucleus by Rosetta/VIRTIS. I) Prelanding mission phase

Authors: Gianrico Filacchione, Fabrizio Capaccioni, Mauro Ciarniello, Andrea Raponi, Federico Tosi, Maria Cristina De Sanctis, Stephane Erard, Dominique Bockelee Morvan, Cedric Leyrat, Gabriele Arnold, Bernard Schmitt, Eric Quirico, Giuseppe Piccioni, Alessandra Migliorini, Maria Teresa Capria, Ernesto Palomba, Priscilla Cerroni, Andrea Longobardo, Antonella Barucci, Sonia Fornasier, Robert W. Carlson, Ralf Jaumann, Katrin Stephan, Lyuba V. Moroz, David Kappel, et al. (5 additional authors not shown)

Abstract: From August to November 2014 the Rosetta orbiter performed an extensive observation campaign aimed at characterizing the properties of the 67P/CG nucleus and selecting the Philae landing site. The campaign led to the production of a global map of the illuminated portion of the 67P/CG nucleus. During this prelanding phase the comet's heliocentric distance decreased from 3.62 to 2.93 AU while Rosetta was orbiting the nucleus at distances between 100 and 10 km. VIRTIS-M, the Visible and InfraRed Thermal Imaging Spectrometer - Mapping channel (Coradini et al. 2007) onboard the orbiter, acquired 0.25-5.1 micron hyperspectral data of the entire illuminated surface, e.g. the north hemisphere and the equatorial regions, with spatial resolution between 2.5 and 25 m/pixel. I/F spectra have been corrected for thermal emission in the 3.5-5.1 micron range and for the surface's photometric response. The resulting reflectance spectra have been used to compute several Cometary Spectral Indicators (CSI): single scattering albedo at 0.55 micron, 0.5-0.8 micron and 1.0-2.5 micron spectral slopes, and 3.2 micron organic material and 2.0 micron water ice band parameters (center, depth), with the aim of mapping their spatial distribution on the surface and studying their temporal variability as the nucleus moved towards the Sun.
Indeed, throughout the investigated period, the nucleus surface shows a significant increase of the single scattering albedo along with a decrease of the 0.5-0.8 and 1.0-2.5 micron spectral slopes, indicating a flattening of the reflectance. We attribute the origin of this effect to the partial removal of the dust layer caused by the increased contribution of water sublimation to the gaseous activity as the comet crossed the frost line.

Submitted 29 February, 2016; originally announced February 2016.

Comments: 19 Figures, 5 Tables. Accepted for publication in Icarus journal on 29 February 2016

arXiv:1504.05143 [pdf, other]

doi 10.1371/journal.pcbi.1004485

Network Plasticity as Bayesian Inference

Authors: David Kappel, Stefan Habenschuss, Robert Legenstein, Wolfgang Maass

Abstract: General results from statistical learning theory suggest understanding not only brain computations but also brain plasticity as probabilistic inference. But a model for that has been missing. We propose that inherently stochastic features of synaptic plasticity and spine motility enable cortical networks of neurons to carry out probabilistic inference by sampling from a posterior distribution of network configurations. This model provides a viable alternative to existing models that propose convergence of parameters to maximum likelihood values. It explains how priors on weight distributions and connection probabilities can be merged optimally with learned experience, how cortical networks can generalize learned information so well to novel experiences, and how they can compensate continuously for unforeseen disturbances of the network. The resulting new theory of network plasticity explains from a functional perspective a range of experimental data on stochastic aspects of synaptic plasticity that previously appeared to be quite puzzling.

Submitted 20 April, 2015; originally announced April 2015.

Comments: 33 pages, 5 figures, the supplement is available on the author's web page http://www.igi.tugraz.at/kappel

Showing 1–31 of 31 results for author: Kappel, D