-
Model-free front-to-end training of a large high performance laser neural network
Authors:
Anas Skalli,
Satoshi Sunada,
Mirko Goldmann,
Marcin Gebski,
Stephan Reitzenstein,
James A. Lott,
Tomasz Czyszanowski,
Daniel Brunner
Abstract:
Artificial neural networks (ANNs) have become ubiquitous and have revolutionized many applications, ranging from computer vision to medical diagnoses. However, they offer a fundamentally connectionist and distributed approach to computing, in stark contrast to classical computers that use the von Neumann architecture. This distinction has sparked renewed interest in developing unconventional hardware to support more efficient implementations of ANNs, rather than merely emulating them on traditional systems. Photonics stands out as a particularly promising platform, providing scalability, high speed, energy efficiency, and the ability for parallel information processing. However, fully realized autonomous optical neural networks (ONNs) with in-situ learning capabilities are still rare. In this work, we demonstrate a fully autonomous and parallel ONN based on a multimode vertical-cavity surface-emitting laser (VCSEL) and built from off-the-shelf components. Our ONN is highly efficient and is scalable both in network size and in inference bandwidth towards the GHz range. High-performance, hardware-compatible optimization algorithms are necessary to minimize reliance on external von Neumann computers and thereby fully exploit the potential of ONNs. As such, we present and extensively study several algorithms which are broadly compatible with a wide range of systems. We then apply these algorithms to optimize our ONN, and benchmark them using the MNIST dataset. We show that our ONN can achieve high accuracy and convergence efficiency, even under limited hardware resources. Crucially, we compare these algorithms in terms of scaling and of optimization efficiency, measured as convergence time, which matters when working with limited external resources. Our work provides some guidance for the design of future ONNs as well as a simple and flexible way to train them.
Submitted 21 March, 2025;
originally announced March 2025.
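Note: the abstract above does not name the model-free algorithms it studies. As a generic illustration of what "model-free" training can look like, the sketch below uses simultaneous perturbation stochastic approximation (SPSA), a standard technique swapped in here as an assumption and not taken from the paper. The function name `spsa_minimize` and all parameter values are hypothetical. The loss is treated as a black box that is only ever evaluated, never differentiated, which is the situation when the "model" is a physical system.

```python
import numpy as np

def spsa_minimize(loss, w0, n_iter=500, lr=0.05, c=0.1, seed=0):
    """Minimize a black-box loss using two loss evaluations per step (SPSA).

    Illustrative only: not the algorithm from the paper above.
    """
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(n_iter):
        # Perturb all weights at once with a random +/-1 pattern.
        delta = rng.choice([-1.0, 1.0], size=w.shape)
        # Two measurements of the loss; no gradient model is required.
        diff = loss(w + c * delta) - loss(w - c * delta)
        # Gradient estimate; since delta is +/-1, 1/delta equals delta.
        grad_est = diff / (2.0 * c) * delta
        w -= lr * grad_est
    return w

# Usage on a toy quadratic standing in for a hardware measurement.
target = np.linspace(-1.0, 1.0, 10)
toy_loss = lambda w: float(np.sum((w - target) ** 2))
w_final = spsa_minimize(toy_loss, np.zeros(10))
```

The appeal of this family of methods for hardware is that the cost per update is two measurements regardless of the number of weights.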
-
Roadmap on Neuromorphic Photonics
Authors:
Daniel Brunner,
Bhavin J. Shastri,
Mohammed A. Al Qadasi,
H. Ballani,
Sylvain Barbay,
Stefano Biasi,
Peter Bienstman,
Simon Bilodeau,
Wim Bogaerts,
Fabian Böhm,
G. Brennan,
Sonia Buckley,
Xinlun Cai,
Marcello Calvanese Strinati,
B. Canakci,
Benoit Charbonnier,
Mario Chemnitz,
Yitong Chen,
Stanley Cheung,
Jeff Chiles,
Suyeon Choi,
Demetrios N. Christodoulides,
Lukas Chrostowski,
J. Chu,
J. H. Clegg
, et al. (125 additional authors not shown)
Abstract:
This roadmap consolidates recent advances while exploring emerging applications, reflecting the remarkable diversity of hardware platforms, neuromorphic concepts, and implementation philosophies reported in the field. It emphasizes the critical role of cross-disciplinary collaboration in this rapidly evolving field.
Submitted 16 January, 2025; v1 submitted 14 January, 2025;
originally announced January 2025.
-
Annealing-inspired training of an optical neural network with ternary weights
Authors:
Anas Skalli,
Mirko Goldmann,
Nasibeh Haghighi,
Stephan Reitzenstein,
James A. Lott,
Daniel Brunner
Abstract:
Artificial neural networks (ANNs) represent a fundamentally connectionist and distributed approach to computing, and as such they differ from classical computers that utilize the von Neumann architecture. This has revived research interest in new unconventional hardware to enable more efficient implementations of ANNs rather than emulating them on traditional machines. In order to fully leverage the capabilities of this new generation of ANNs, optimization algorithms that take into account hardware limitations and imperfections are necessary. Photonics represents a particularly promising platform, offering scalability, high speed, energy efficiency, and the capability for parallel information processing. Yet, fully fledged implementations of autonomous optical neural networks (ONNs) with in-situ learning remain scarce. In this work, we propose a ternary-weight architecture for a high-dimensional semiconductor laser-based ONN. We introduce a simple method for achieving ternary weights with Boolean hardware, significantly increasing the ONN's information processing capabilities. Furthermore, we design a novel in-situ optimization algorithm that is compatible with both Boolean and ternary weights, and provide a detailed hyperparameter study of said algorithm for two different tasks. Our novel algorithm yields benefits in terms of both convergence speed and performance. Finally, we experimentally characterize the long-term inference stability of our ONN and find that it is extremely stable, with a consistency above 99% over a period of more than 10 hours, addressing one of the main concerns in the field. Our work is of particular relevance in the context of in-situ learning under restricted hardware resources, especially since minimizing the power consumption of auxiliary hardware is crucial to preserving efficiency gains achieved by non-von Neumann ANN implementations.
Submitted 2 October, 2024; v1 submitted 2 September, 2024;
originally announced September 2024.
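Note: the abstract above does not spell out how ternary weights are obtained from Boolean hardware. One simple construction, shown here purely as an assumption, is to take the difference of two Boolean masks, so that each effective weight lies in {-1, 0, +1} and the ternary readout equals the difference of two Boolean readouts. The function names `ternary_from_boolean` and `ternary_readout` are hypothetical.

```python
import numpy as np

def ternary_from_boolean(w_plus, w_minus):
    """Combine two Boolean masks into ternary weights in {-1, 0, +1}.

    Assumed construction for illustration, not necessarily the paper's.
    """
    w_plus = np.asarray(w_plus, dtype=bool)
    w_minus = np.asarray(w_minus, dtype=bool)
    return w_plus.astype(int) - w_minus.astype(int)

def ternary_readout(x, w_plus, w_minus):
    """Ternary-weighted sum computed as two Boolean-weighted sums."""
    x = np.asarray(x, dtype=float)
    pos = x @ np.asarray(w_plus, dtype=float)   # first Boolean measurement
    neg = x @ np.asarray(w_minus, dtype=float)  # second Boolean measurement
    return pos - neg

# Usage: both masks set on the same node cancel to weight 0.
w = ternary_from_boolean([1, 0, 0, 1], [0, 1, 0, 1])
y = ternary_readout([1.0, 2.0, 3.0, 4.0], [1, 0, 0, 1], [0, 1, 0, 1])
```

In this sketch the cost of the extra weight level is a second Boolean measurement per output.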
-
Adaptive control of recurrent neural networks using conceptors
Authors:
Guillaume Pourcel,
Mirko Goldmann,
Ingo Fischer,
Miguel C. Soriano
Abstract:
Recurrent Neural Networks excel at predicting and generating complex high-dimensional temporal patterns. Due to their inherent nonlinear dynamics and memory, they can learn unbounded temporal dependencies from data. In a Machine Learning setting, the network's parameters are adapted during a training phase to match the requirements of a given task, increasing its computational capabilities. After the training, the network parameters are kept fixed to exploit the learned computations. The static parameters thereby render the network unadaptive to changing conditions, such as external or internal perturbation. In this manuscript, we demonstrate how keeping parts of the network adaptive even after the training enhances its functionality and robustness. Here, we utilize the conceptor framework and conceptualize an adaptive control loop that continuously analyzes the network's behavior and adjusts its time-varying internal representation to follow a desired target. We demonstrate how the added adaptivity of the network supports the computational functionality in three distinct tasks: interpolation of temporal patterns, stabilization against partial network degradation, and robustness against input distortion. Our results highlight the potential of adaptive networks in machine learning beyond training, enabling them to not only learn complex patterns but also dynamically adjust to changing environments, ultimately broadening their applicability.
Submitted 12 May, 2024;
originally announced May 2024.
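Note: for readers unfamiliar with the conceptor framework mentioned above, the sketch below computes a conceptor matrix from collected network states using the standard definition C = R (R + α⁻² I)⁻¹, where R is the state correlation matrix and α is the aperture. This covers only the basic definition. The adaptive control loop described in the abstract is not reproduced, and the function name `conceptor` and the random stand-in states are assumptions for illustration.

```python
import numpy as np

def conceptor(states, aperture=10.0):
    """Conceptor C = R (R + aperture^-2 I)^-1 from a (time, nodes) state array."""
    X = np.asarray(states, dtype=float)
    R = X.T @ X / X.shape[0]          # state correlation matrix
    n = R.shape[0]
    return R @ np.linalg.inv(R + aperture ** -2 * np.eye(n))

# Usage with random stand-in states (a real use would record RNN states).
rng = np.random.default_rng(0)
demo_states = rng.normal(size=(200, 5))
C = conceptor(demo_states, aperture=10.0)
```

The eigenvalues of C lie between 0 and 1, so C acts as a soft projector onto the directions the states occupy; a larger aperture pushes the eigenvalues towards 1.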
-
A gradient-based approach to fast and accurate head motion compensation in cone-beam CT
Authors:
Mareike Thies,
Fabian Wagner,
Noah Maul,
Haijun Yu,
Manuela Goldmann,
Linda-Sophie Schneider,
Mingxuan Gu,
Siyuan Mei,
Lukas Folle,
Alexander Preuhs,
Michael Manhart,
Andreas Maier
Abstract:
Cone-beam computed tomography (CBCT) systems, with their flexibility, present a promising avenue for direct point-of-care medical imaging, particularly in critical scenarios such as acute stroke assessment. However, the integration of CBCT into clinical workflows faces challenges, primarily linked to long scan durations, which result in patient motion during scanning and lead to image quality degradation in the reconstructed volumes. This paper introduces a novel approach to CBCT motion estimation using a gradient-based optimization algorithm, which leverages generalized derivatives of the backprojection operator for cone-beam CT geometries. Building on that, a fully differentiable target function is formulated which grades the quality of the current motion estimate in reconstruction space. We drastically accelerate motion estimation, yielding a 19-fold speed-up compared to existing methods. Additionally, we investigate the architecture of networks used for quality metric regression and propose predicting voxel-wise quality maps, favoring autoencoder-like architectures over contracting ones. This modification improves gradient flow, leading to more accurate motion estimation. The presented method is evaluated through realistic experiments on head anatomy. It achieves a reduction in reprojection error from an initial average of 3 mm to 0.61 mm after motion compensation and consistently demonstrates superior performance compared to existing approaches. The analytic Jacobian for the backprojection operation, which is at the core of the proposed method, is made publicly available. In summary, this paper contributes to the advancement of CBCT integration into clinical workflows by proposing a robust motion estimation approach that enhances efficiency and accuracy, addressing critical challenges in time-sensitive scenarios.
Submitted 21 October, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
Learn one size to infer all: Exploiting translational symmetries in delay-dynamical and spatio-temporal systems using scalable neural networks
Authors:
Mirko Goldmann,
Claudio R. Mirasso,
Ingo Fischer,
Miguel C. Soriano
Abstract:
We design scalable neural networks adapted to translational symmetries in dynamical systems, capable of inferring untrained high-dimensional dynamics for different system sizes. We train these networks to predict the dynamics of delay-dynamical and spatio-temporal systems for a single size. Then, we drive the networks by their own predictions. We demonstrate that by scaling the size of the trained network, we can predict the complex dynamics for larger or smaller system sizes. Thus, the network learns from a single example and, by exploiting symmetry properties, infers entire bifurcation diagrams.
Submitted 5 July, 2024; v1 submitted 5 November, 2021;
originally announced November 2021.
-
Deep Time-Delay Reservoir Computing: Dynamics and Memory Capacity
Authors:
Mirko Goldmann,
Felix Köster,
Kathy Lüdge,
Serhiy Yanchuk
Abstract:
The Deep Time-Delay Reservoir Computing concept utilizes unidirectionally connected systems with time-delays for supervised learning. We present how the dynamical properties of a deep Ikeda-based reservoir are related to its memory capacity (MC) and how that can be used for optimization. In particular, we analyze bifurcations of the corresponding autonomous system and compute conditional Lyapunov exponents, which measure the generalized synchronization between the input and the layer dynamics. We show how the MC is related to the system's distance to bifurcations or to the magnitude of the conditional Lyapunov exponent. The interplay of different dynamical regimes leads to an adjustable distribution between linear and nonlinear MC. Furthermore, numerical simulations show resonances between clock cycle and delays of the layers in all degrees of the MC. Contrary to MC losses in single-layer reservoirs, these resonances can boost separate degrees of the MC and can be used, e.g., to design a system with maximum linear MC. Accordingly, we present two configurations that empower either high nonlinear MC or long-time linear MC.
Submitted 25 August, 2020; v1 submitted 11 June, 2020;
originally announced June 2020.
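Note: the linear memory capacity referred to above is commonly estimated by training one linear readout per delay k to reconstruct the input u(t − k) and summing the squared correlations between reconstruction and target. The sketch below does this for a small random echo state network, which is an assumption made for brevity; it is not the deep Ikeda time-delay reservoir studied in the paper, and all names and parameter values are hypothetical.

```python
import numpy as np

def linear_memory_capacity(n_nodes=50, n_steps=3000, max_delay=40,
                           washout=200, seed=0):
    """Return the squared correlation per delay for a random tanh reservoir.

    The readouts are fitted and evaluated on the same data, so values are
    optimistic; a careful study would use a held-out test set.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_nodes, n_nodes))
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius 0.9
    w_in = rng.uniform(-0.5, 0.5, size=n_nodes)
    u = rng.uniform(-1.0, 1.0, size=n_steps)         # i.i.d. input sequence

    x = np.zeros(n_nodes)
    states = np.empty((n_steps, n_nodes))
    for t in range(n_steps):
        x = np.tanh(W @ x + w_in * u[t])
        states[t] = x

    # Discard the transient and append a bias column for the readout.
    X = np.hstack([states[washout:], np.ones((n_steps - washout, 1))])
    mc = []
    for k in range(1, max_delay + 1):
        target = u[washout - k:n_steps - k]          # input delayed by k steps
        w_out, *_ = np.linalg.lstsq(X, target, rcond=None)
        r = np.corrcoef(X @ w_out, target)[0, 1]
        mc.append(r ** 2)
    return np.array(mc)

# Usage: the total linear MC is the sum over delays.
mc_per_delay = linear_memory_capacity(n_nodes=20, n_steps=1000,
                                      max_delay=10, washout=100)
total_mc = mc_per_delay.sum()
```

Each term is a squared correlation and therefore lies between 0 and 1, so the total is bounded by the number of delays evaluated.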