-
Leveraging chaos in the training of artificial neural networks
Authors:
Pedro Jiménez-González,
Miguel C. Soriano,
Lucas Lacasa
Abstract:
Traditional algorithms to optimize artificial neural networks when confronted with a supervised learning task are usually exploitation-type relaxational dynamics such as gradient descent (GD). Here, we explore the dynamics of the neural network trajectory along training for unconventionally large learning rates. We show that for a region of values of the learning rate, the GD optimization shifts a…
▽ More
Traditional algorithms to optimize artificial neural networks when confronted with a supervised learning task are usually exploitation-type relaxational dynamics such as gradient descent (GD). Here, we explore the dynamics of the neural network trajectory along training for unconventionally large learning rates. We show that for a region of values of the learning rate, the GD optimization shifts away from purely exploitation-like algorithm into a regime of exploration-exploitation balance, as the neural network is still capable of learning but the trajectory shows sensitive dependence on initial conditions -- as characterized by positive network maximum Lyapunov exponent --. Interestingly, the characteristic training time required to reach an acceptable accuracy in the test set reaches a minimum precisely in such learning rate region, further suggesting that one can accelerate the training of artificial neural networks by locating at the onset of chaos. Our results -- initially illustrated for the MNIST classification task -- qualitatively hold for a range of supervised learning tasks, learning architectures and other hyperparameters, and showcase the emergent, constructive role of transient chaotic dynamics in the training of artificial neural networks.
△ Less
Submitted 10 June, 2025;
originally announced June 2025.
-
Roadmap on Neuromorphic Photonics
Authors:
Daniel Brunner,
Bhavin J. Shastri,
Mohammed A. Al Qadasi,
H. Ballani,
Sylvain Barbay,
Stefano Biasi,
Peter Bienstman,
Simon Bilodeau,
Wim Bogaerts,
Fabian Böhm,
G. Brennan,
Sonia Buckley,
Xinlun Cai,
Marcello Calvanese Strinati,
B. Canakci,
Benoit Charbonnier,
Mario Chemnitz,
Yitong Chen,
Stanley Cheung,
Jeff Chiles,
Suyeon Choi,
Demetrios N. Christodoulides,
Lukas Chrostowski,
J. Chu,
J. H. Clegg
, et al. (125 additional authors not shown)
Abstract:
This roadmap consolidates recent advances while exploring emerging applications, reflecting the remarkable diversity of hardware platforms, neuromorphic concepts, and implementation philosophies reported in the field. It emphasizes the critical role of cross-disciplinary collaboration in this rapidly evolving field.
This roadmap consolidates recent advances while exploring emerging applications, reflecting the remarkable diversity of hardware platforms, neuromorphic concepts, and implementation philosophies reported in the field. It emphasizes the critical role of cross-disciplinary collaboration in this rapidly evolving field.
△ Less
Submitted 16 January, 2025; v1 submitted 14 January, 2025;
originally announced January 2025.
-
Adaptive control of recurrent neural networks using conceptors
Authors:
Guillaume Pourcel,
Mirko Goldmann,
Ingo Fischer,
Miguel C. Soriano
Abstract:
Recurrent Neural Networks excel at predicting and generating complex high-dimensional temporal patterns. Due to their inherent nonlinear dynamics and memory, they can learn unbounded temporal dependencies from data. In a Machine Learning setting, the network's parameters are adapted during a training phase to match the requirements of a given task/problem increasing its computational capabilities.…
▽ More
Recurrent Neural Networks excel at predicting and generating complex high-dimensional temporal patterns. Due to their inherent nonlinear dynamics and memory, they can learn unbounded temporal dependencies from data. In a Machine Learning setting, the network's parameters are adapted during a training phase to match the requirements of a given task/problem increasing its computational capabilities. After the training, the network parameters are kept fixed to exploit the learned computations. The static parameters thereby render the network unadaptive to changing conditions, such as external or internal perturbation. In this manuscript, we demonstrate how keeping parts of the network adaptive even after the training enhances its functionality and robustness. Here, we utilize the conceptor framework and conceptualize an adaptive control loop analyzing the network's behavior continuously and adjusting its time-varying internal representation to follow a desired target. We demonstrate how the added adaptivity of the network supports the computational functionality in three distinct tasks: interpolation of temporal patterns, stabilization against partial network degradation, and robustness against input distortion. Our results highlight the potential of adaptive networks in machine learning beyond training, enabling them to not only learn complex patterns but also dynamically adjust to changing environments, ultimately broadening their applicability.
△ Less
Submitted 12 May, 2024;
originally announced May 2024.
-
Dynamical stability and chaos in artificial neural network trajectories along training
Authors:
Kaloyan Danovski,
Miguel C. Soriano,
Lucas Lacasa
Abstract:
The process of training an artificial neural network involves iteratively adapting its parameters so as to minimize the error of the network's prediction, when confronted with a learning task. This iterative change can be naturally interpreted as a trajectory in network space -- a time series of networks -- and thus the training algorithm (e.g. gradient descent optimization of a suitable loss func…
▽ More
The process of training an artificial neural network involves iteratively adapting its parameters so as to minimize the error of the network's prediction, when confronted with a learning task. This iterative change can be naturally interpreted as a trajectory in network space -- a time series of networks -- and thus the training algorithm (e.g. gradient descent optimization of a suitable loss function) can be interpreted as a dynamical system in graph space. In order to illustrate this interpretation, here we study the dynamical properties of this process by analyzing through this lens the network trajectories of a shallow neural network, and its evolution through learning a simple classification task. We systematically consider different ranges of the learning rate and explore both the dynamical and orbital stability of the resulting network trajectories, finding hints of regular and chaotic behavior depending on the learning rate regime. Our findings are put in contrast to common wisdom on convergence properties of neural networks and dynamical systems theory. This work also contributes to the cross-fertilization of ideas between dynamical systems theory, network theory and machine learning
△ Less
Submitted 8 April, 2024;
originally announced April 2024.
-
Learn one size to infer all: Exploiting translational symmetries in delay-dynamical and spatio-temporal systems using scalable neural networks
Authors:
Mirko Goldmann,
Claudio R. Mirasso,
Ingo Fischer,
Miguel C. Soriano
Abstract:
We design scalable neural networks adapted to translational symmetries in dynamical systems, capable of inferring untrained high-dimensional dynamics for different system sizes. We train these networks to predict the dynamics of delay-dynamical and spatio-temporal systems for a single size. Then, we drive the networks by their own predictions. We demonstrate that by scaling the size of the trained…
▽ More
We design scalable neural networks adapted to translational symmetries in dynamical systems, capable of inferring untrained high-dimensional dynamics for different system sizes. We train these networks to predict the dynamics of delay-dynamical and spatio-temporal systems for a single size. Then, we drive the networks by their own predictions. We demonstrate that by scaling the size of the trained network, we can predict the complex dynamics for larger or smaller system sizes. Thus, the network learns from a single example and, by exploiting symmetry properties, infers entire bifurcation diagrams.
△ Less
Submitted 5 July, 2024; v1 submitted 5 November, 2021;
originally announced November 2021.
-
Nonlinear photonic dynamical systems for unconventional computing
Authors:
Daniel Brunner,
Laurent Larger,
Miguel C. Soriano
Abstract:
Driven by the remarkable breakthroughs during the past decade, photonics neural networks have experienced a revival. Here, we provide a general overview of progress over the past decade, and sketch a roadmap of important future developments. We focus on photonic implementations of the reservoir computing machine learning paradigm, which offers a conceptually simple approach that is amenable to har…
▽ More
Driven by the remarkable breakthroughs during the past decade, photonics neural networks have experienced a revival. Here, we provide a general overview of progress over the past decade, and sketch a roadmap of important future developments. We focus on photonic implementations of the reservoir computing machine learning paradigm, which offers a conceptually simple approach that is amenable to hardware implementations. In particular, we provide an overview of photonic reservoir computing implemented via either spatio temporal or delay dynamical systems. Going beyond reservoir computing, we discuss recent advances and future challenges of photonic implementations of deep neural networks, of the quest for learning methods that are hardware-friendly as well as realizing autonomous photonic neural networks, i.e. with minimal digital electronic auxiliary hardware.
△ Less
Submitted 19 July, 2021;
originally announced July 2021.
-
Unveiling the role of plasticity rules in reservoir computing
Authors:
Guillermo B. Morales,
Claudio R. Mirasso,
Miguel C. Soriano
Abstract:
Reservoir Computing (RC) is an appealing approach in Machine Learning that combines the high computational capabilities of Recurrent Neural Networks with a fast and easy training method. Likewise, successful implementation of neuro-inspired plasticity rules into RC artificial networks has boosted the performance of the original models. In this manuscript, we analyze the role that plasticity rules…
▽ More
Reservoir Computing (RC) is an appealing approach in Machine Learning that combines the high computational capabilities of Recurrent Neural Networks with a fast and easy training method. Likewise, successful implementation of neuro-inspired plasticity rules into RC artificial networks has boosted the performance of the original models. In this manuscript, we analyze the role that plasticity rules play on the changes that lead to a better performance of RC. To this end, we implement synaptic and non-synaptic plasticity rules in a paradigmatic example of RC model: the Echo State Network. Testing on nonlinear time series prediction tasks, we show evidence that improved performance in all plastic models are linked to a decrease of the pair-wise correlations in the reservoir, as well as a significant increase of individual neurons ability to separate similar inputs in their activity space. Here we provide new insights on this observed improvement through the study of different stages on the plastic learning. From the perspective of the reservoir dynamics, optimal performance is found to occur close to the so-called edge of instability. Our results also show that it is possible to combine different forms of plasticity (namely synaptic and non-synaptic rules) to further improve the performance on prediction tasks, obtaining better results than those achieved with single-plasticity models.
△ Less
Submitted 14 January, 2021;
originally announced January 2021.
-
Reservoir computing with a single time-delay autonomous Boolean node
Authors:
Nicholas D. Haynes,
Miguel C. Soriano,
David P. Rosin,
Ingo Fischer,
Daniel J. Gauthier
Abstract:
We demonstrate reservoir computing with a physical system using a single autonomous Boolean logic element with time-delay feedback. The system generates a chaotic transient with a window of consistency lasting between 30 and 300 ns, which we show is sufficient for reservoir computing. We then characterize the dependence of computational performance on system parameters to find the best operating p…
▽ More
We demonstrate reservoir computing with a physical system using a single autonomous Boolean logic element with time-delay feedback. The system generates a chaotic transient with a window of consistency lasting between 30 and 300 ns, which we show is sufficient for reservoir computing. We then characterize the dependence of computational performance on system parameters to find the best operating point of the reservoir. When the best parameters are chosen, the reservoir is able to classify short input patterns with performance that decreases over time. In particular, we show that four distinct input patterns can be classified for 70 ns, even though the inputs are only provided to the reservoir for 7.5 ns.
△ Less
Submitted 30 January, 2015; v1 submitted 4 November, 2014;
originally announced November 2014.
-
Assessing coupling dynamics from an ensemble of time series
Authors:
German Gomez-Herrero,
Wei Wu,
Kalle Rutanen,
Miguel C. Soriano,
Gordon Pipa,
Raul Vicente
Abstract:
Finding interdependency relations between (possibly multivariate) time series provides valuable knowledge about the processes that generate the signals. Information theory sets a natural framework for non-parametric measures of several classes of statistical dependencies. However, a reliable estimation from information-theoretic functionals is hampered when the dependency to be assessed is brief o…
▽ More
Finding interdependency relations between (possibly multivariate) time series provides valuable knowledge about the processes that generate the signals. Information theory sets a natural framework for non-parametric measures of several classes of statistical dependencies. However, a reliable estimation from information-theoretic functionals is hampered when the dependency to be assessed is brief or evolves in time. Here, we show that these limitations can be overcome when we have access to an ensemble of independent repetitions of the time series. In particular, we gear a data-efficient estimator of probability densities to make use of the full structure of trial-based measures. By doing so, we can obtain time-resolved estimates for a family of entropy combinations (including mutual information, transfer entropy, and their conditional counterparts) which are more accurate than the simple average of individual estimates over trials. We show with simulated and real data that the proposed approach allows to recover the time-resolved dynamics of the coupling between different subsystems.
△ Less
Submitted 3 August, 2010;
originally announced August 2010.