Search | arXiv e-print repository

doi 10.1016/j.amc.2023.128253

Hopfield model with planted patterns: a teacher-student self-supervised learning model

Authors: Francesco Alemanno, Luca Camanzi, Gianluca Manzan, Daniele Tantari

Abstract: While Hopfield networks are known as paradigmatic models for memory storage and retrieval, modern artificial intelligence systems mainly stand on the machine learning paradigm. We show that it is possible to formulate a teacher-student self-supervised learning problem with Boltzmann machines in terms of a suitable generalization of the Hopfield model with structured patterns, where the spin variables are the machine weights and patterns correspond to the training set's examples. We analyze the learning performance by studying the phase diagram in terms of the training set size, the dataset noise and the inference temperature (i.e. the weight regularization). With a small but informative dataset the machine can learn by memorization. With a noisy dataset, an extensive number of examples above a critical threshold is needed. In this regime the memory storage limits of the system become an opportunity for the occurrence of a learning regime in which the system can generalize.

Submitted 31 December, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

Comments: 27 pages, 5 figures, typo in the free energy corrected

Journal ref: Applied Mathematics and Computation, 2023, 458, 128253

arXiv:2111.12997 [pdf, other]

doi 10.1007/s10955-022-02966-8

Replica symmetry breaking in dense neural networks

Authors: Linda Albanese, Francesco Alemanno, Andrea Alessandrelli, Adriano Barra

Abstract: Understanding the glassy nature of neural networks is pivotal both for theoretical and computational advances in Machine Learning and Theoretical Artificial Intelligence. Keeping the focus on dense associative Hebbian neural networks, the purpose of this paper is two-fold: first we develop rigorous mathematical approaches to properly address a statistical mechanical picture of the phenomenon of replica symmetry breaking (RSB) in these networks; then, building on the results obtained via these routes, we aim to inspect the glassiness that they hide. In particular, regarding the methodology, we provide two techniques: the former is an adaptation of the transport PDE to the case, while the latter is an extension of Guerra's interpolation breakthrough. Beyond coherence among the results, both at the replica symmetric and at the one-step replica symmetry breaking level of description, we prove Gardner's picture and we identify the maximal storage capacity by a ground-state analysis in the Baldi-Venkatesh high-storage regime. In the second part of the paper we investigate the glassy structure of these networks: in contrast with the replica symmetric scenario (RS), RSB actually stabilizes the spin-glass phase. We report huge differences w.r.t. the standard pairwise Hopfield limit: in particular, it is known that it is possible to express the free energy of the Hopfield neural network as a linear combination of the free energies of a hard spin glass (i.e. the Sherrington-Kirkpatrick model) and a soft spin glass (the Gaussian or "spherical" model). This is no longer true when interactions are more than pairwise (whatever the level of description, RS or RSB): for dense networks solely the free energy of the hard spin glass survives, proving a huge diversity in the underlying glassiness of associative neural networks.

Submitted 25 November, 2021; originally announced November 2021.

arXiv:2106.08978 [pdf, other]

Pattern recognition in Deep Boltzmann machines

Authors: Elena Agliari, Linda Albanese, Francesco Alemanno, Alberto Fachechi

Abstract: We consider a multi-layer Sherrington-Kirkpatrick spin-glass as a model for deep restricted Boltzmann machines and we solve for its quenched free energy, in the thermodynamic limit and allowing for a first step of replica symmetry breaking. This result is accomplished rigorously exploiting interpolating techniques and recovering the expression already known for the replica-symmetry case. Further, we drop the restriction constraint by introducing intra-layer connections among spins and we show that the resulting system can be mapped into a modular Hopfield network, which is also addressed rigorously via interpolating techniques up to the first step of replica symmetry breaking.

Submitted 16 June, 2021; originally announced June 2021.

Comments: 24 pages, 2 figures

Report number: Roma01.Math

Journal ref: Journal of Physics A: Mathematical and Theoretical, Volume 54, Number 50 (2021)

Showing 1–3 of 3 results for author: Alemanno, F