Showing 1–2 of 2 results for author: Skerk, R

  1. arXiv:2505.24849  [pdf, ps, other]

    stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.IT cs.LG

    Statistical mechanics of extensive-width Bayesian neural networks near interpolation

    Authors: Jean Barbier, Francesco Camilli, Minh-Toan Nguyen, Mauro Pastore, Rudy Skerk

    Abstract: For three decades statistical mechanics has been providing a framework to analyse neural networks. However, the theoretically tractable models, e.g., perceptrons, random features models and kernel machines, or multi-index models and committee machines with few neurons, remained simple compared to those used in applications. In this paper we help reducing the gap between practical networks and thei…

    Submitted 30 May, 2025; originally announced May 2025.

    Comments: 9 pages + appendices, 12 figures. This submission supersedes arXiv:2501.18530

  2. arXiv:2501.18530  [pdf, other]

    stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.IT cs.LG

    Optimal generalisation and learning transition in extensive-width shallow neural networks near interpolation

    Authors: Jean Barbier, Francesco Camilli, Minh-Toan Nguyen, Mauro Pastore, Rudy Skerk

    Abstract: We consider a teacher-student model of supervised learning with a fully-trained two-layer neural network whose width $k$ and input dimension $d$ are large and proportional. We provide an effective theory for approximating the Bayes-optimal generalisation error of the network for any activation function in the regime of sample size $n$ scaling quadratically with the input dimension, i.e., around th…

    Submitted 1 April, 2025; v1 submitted 30 January, 2025; originally announced January 2025.

    Comments: v2: 9 pages + appendix, 10 figures, 3 tables; added discussion on Gaussian inner weights (Fig. 2, 5 + Appendix H); added discussion on algorithmic complexity of specialisation (Appendix I and figures therein)