
Showing 1–26 of 26 results for author: Loureiro, B

Searching in archive cond-mat.
  1. arXiv:2506.02651  [pdf, ps, other]

    stat.ML cond-mat.dis-nn cs.LG

    Asymptotics of SGD in Sequence-Single Index Models and Single-Layer Attention Networks

    Authors: Luca Arnaboldi, Bruno Loureiro, Ludovic Stephan, Florent Krzakala, Lenka Zdeborová

    Abstract: We study the dynamics of stochastic gradient descent (SGD) for a class of sequence models termed Sequence Single-Index (SSI) models, where the target depends on a single direction in input space applied to a sequence of tokens. This setting generalizes classical single-index models to the sequential domain, encompassing simplified one-layer attention architectures. We derive a closed-form expressi…

    Submitted 3 June, 2025; originally announced June 2025.

  2. arXiv:2502.02545  [pdf, ps, other]

    cs.LG cond-mat.dis-nn

    Optimal Spectral Transitions in High-Dimensional Multi-Index Models

    Authors: Leonardo Defilippis, Yatin Dandi, Pierre Mergny, Florent Krzakala, Bruno Loureiro

    Abstract: We consider the problem of how many samples from a Gaussian multi-index model are required to weakly reconstruct the relevant index subspace. Despite its increasing popularity as a testbed for investigating the computational complexity of neural networks, results beyond the single-index setting remain elusive. In this work, we introduce spectral algorithms based on the linearization of a message p…

    Submitted 10 June, 2025; v1 submitted 4 February, 2025; originally announced February 2025.

  3. arXiv:2410.16073  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG math.ST

    On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds

    Authors: Matteo Vilucchio, Nikolaos Tsilivis, Bruno Loureiro, Julia Kempe

    Abstract: Regularization, whether explicit in terms of a penalty in the loss or implicit in the choice of algorithm, is a cornerstone of modern machine learning. Indeed, controlling the complexity of the model class is particularly important when data is scarce, noisy or contaminated, as it translates a statistical belief on the underlying structure of the data. This work investigates the question of how to…

    Submitted 21 October, 2024; originally announced October 2024.

  4. arXiv:2410.13300  [pdf, other]

    stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.LG math.ST

    A theoretical perspective on mode collapse in variational inference

    Authors: Roman Soletskyi, Marylou Gabrié, Bruno Loureiro

    Abstract: While deep learning has expanded the possibilities for highly expressive variational families, the practical benefits of these tools for variational inference (VI) are often limited by the minimization of the traditional Kullback-Leibler objective, which can yield suboptimal solutions. A major challenge in this context is \emph{mode collapse}: the phenomenon where a model concentrates on a few mod…

    Submitted 17 October, 2024; originally announced October 2024.

  5. arXiv:2405.15699  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Dimension-free deterministic equivalents and scaling laws for random feature regression

    Authors: Leonardo Defilippis, Bruno Loureiro, Theodor Misiakiewicz

    Abstract: In this work we investigate the generalization performance of random feature ridge regression (RFRR). Our main contribution is a general deterministic equivalent for the test error of RFRR. Specifically, under a certain concentration property, we show that the test error is well approximated by a closed-form expression that only depends on the feature map eigenvalues. Notably, our approximation gu…

    Submitted 5 November, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: NeurIPS 2024 camera-ready version

  6. arXiv:2405.15480  [pdf, other]

    cs.LG cond-mat.dis-nn cs.CC

    Fundamental computational limits of weak learnability in high-dimensional multi-index models

    Authors: Emanuele Troiani, Yatin Dandi, Leonardo Defilippis, Lenka Zdeborová, Bruno Loureiro, Florent Krzakala

    Abstract: Multi-index models - functions which only depend on the covariates through a non-linear transformation of their projection on a subspace - are a useful benchmark for investigating feature learning with neural nets. This paper examines the theoretical boundaries of efficient learnability in this hypothesis class, focusing on the minimum sample complexity required for weakly recovering their low-dim…

    Submitted 2 April, 2025; v1 submitted 24 May, 2024; originally announced May 2024.

  7. arXiv:2402.13999  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG math.ST

    Asymptotics of Learning with Deep Structured (Random) Features

    Authors: Dominik Schröder, Daniil Dmitriev, Hugo Cui, Bruno Loureiro

    Abstract: For a large class of feature maps we provide a tight asymptotic characterisation of the test error associated with learning the readout layer, in the high-dimensional limit where the input dimension, hidden layer widths, and number of training samples are proportionally large. This characterization is formulated in terms of the population covariance of the features. Our work is partially motivated…

    Submitted 10 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: ICML camera-ready version

  8. arXiv:2402.13622  [pdf, ps, other]

    stat.ML cond-mat.dis-nn cs.LG

    Analysis of Bootstrap and Subsampling in High-dimensional Regularized Regression

    Authors: Lucas Clarté, Adrien Vandenbroucque, Guillaume Dalle, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: We investigate popular resampling methods for estimating the uncertainty of statistical models, such as subsampling, bootstrap and the jackknife, and their performance in high-dimensional supervised regression tasks. We provide a tight asymptotic description of the biases and variances estimated by these methods in the context of generalized linear models, such as ridge and logistic regression, ta…

    Submitted 1 November, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Journal ref: Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:787-819, 2024

  9. arXiv:2402.05674  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    A High Dimensional Statistical Model for Adversarial Training: Geometry and Trade-Offs

    Authors: Kasimir Tanner, Matteo Vilucchio, Bruno Loureiro, Florent Krzakala

    Abstract: This work investigates adversarial training in the context of margin-based linear classifiers in the high-dimensional regime where the dimension $d$ and the number of data points $n$ diverge with a fixed ratio $α= n / d$. We introduce a tractable mathematical model where the interplay between the data and adversarial attacker geometries can be studied, while capturing the core phenomenology observ…

    Submitted 27 December, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  10. arXiv:2402.04980  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Asymptotics of feature learning in two-layer networks after one gradient-step

    Authors: Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro

    Abstract: In this manuscript, we investigate the problem of how two-layer neural networks learn features from data, and improve over the kernel regime, after being trained with a single gradient descent step. Leveraging the insight from (Ba et al., 2022), we model the trained network by a spiked Random Features (sRF) model. Further building on recent progress on Gaussian universality (Dandi et al., 2023), w…

    Submitted 4 June, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, PMLR 235:9662-9695, 2024

  11. arXiv:2309.16476  [pdf, other]

    math.ST cond-mat.dis-nn cs.LG stat.ML

    High-dimensional robust regression under heavy-tailed data: Asymptotics and Universality

    Authors: Urte Adomaityte, Leonardo Defilippis, Bruno Loureiro, Gabriele Sicuro

    Abstract: We investigate the high-dimensional properties of robust regression estimators in the presence of heavy-tailed contamination of both the covariates and response functions. In particular, we provide a sharp asymptotic characterisation of M-estimators trained on a family of elliptical covariate and noise data distributions including cases where second and higher moments do not exist. We show that, d…

    Submitted 31 May, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: 13 pages + Supplementary information

  12. arXiv:2302.08923  [pdf, other]

    math.ST cond-mat.dis-nn cs.LG stat.ML

    Are Gaussian data all you need? Extents and limits of universality in high-dimensional generalized linear estimation

    Authors: Luca Pesce, Florent Krzakala, Bruno Loureiro, Ludovic Stephan

    Abstract: In this manuscript we consider the problem of generalized linear estimation on Gaussian mixture data with labels given by a single-index model. Our first result is a sharp asymptotic expression for the test and training errors in the high-dimensional regime. Motivated by the recent stream of results on the Gaussian universality of the test and training errors in generalized linear estimation, we a…

    Submitted 17 February, 2023; originally announced February 2023.

  13. arXiv:2302.05882  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks

    Authors: Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro

    Abstract: This manuscript investigates the one-pass stochastic gradient descent (SGD) dynamics of a two-layer neural network trained on Gaussian data and labels generated by a similar, though not necessarily identical, target function. We rigorously analyse the limiting dynamics via a deterministic and low-dimensional description in terms of the sufficient statistics for the population risk. Our unifying an…

    Submitted 12 February, 2023; originally announced February 2023.

  14. arXiv:2205.13527  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG math.PR math.ST

    Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap

    Authors: Luca Pesce, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: A simple model to study subspace clustering is the high-dimensional $k$-Gaussian mixture model where the cluster means are sparse vectors. Here we provide an exact asymptotic characterization of the statistically optimal reconstruction error in this model in the high-dimensional regime with extensive sparsity, i.e. when the fraction of non-zero components of the cluster means $ρ$, as well as the r…

    Submitted 1 December, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: NeurIPS camera-ready version

    Journal ref: Advances in Neural Information Processing Systems (2022), vol 35, pages 27087--27099

  15. arXiv:2205.13303  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG math.PR math.ST

    Gaussian Universality of Perceptrons with Random Labels

    Authors: Federica Gerace, Florent Krzakala, Bruno Loureiro, Ludovic Stephan, Lenka Zdeborová

    Abstract: While classical in many theoretical settings - and in particular in statistical physics-inspired works - the assumption of Gaussian i.i.d. input data is often perceived as a strong limitation in the context of statistics and machine learning. In this study, we redeem this line of work in the case of generalized linear classification, a.k.a. the perceptron model, with random labels. We argue that t…

    Submitted 2 March, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

    Journal ref: Physical Review E 109.3 (2024): 034305

  16. arXiv:2203.12094  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Learning curves for the multi-class teacher-student perceptron

    Authors: Elisabetta Cornacchia, Francesca Mignacco, Rodrigo Veiga, Cédric Gerbelot, Bruno Loureiro, Lenka Zdeborová

    Abstract: One of the most classical results in high-dimensional learning theory provides a closed-form expression for the generalisation error of binary classification with the single-layer teacher-student perceptron on i.i.d. Gaussian inputs. Both Bayes-optimal estimation and empirical risk minimisation (ERM) were extensively analysed for this setting. At the same time, a considerable part of modern machin…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: 14 pages + appendix

    Journal ref: Machine Learning: Science and Technology 4 015019 (2022)

  17. arXiv:2202.03295  [pdf, other]

    cs.LG cond-mat.dis-nn stat.ML

    Theoretical characterization of uncertainty in high-dimensional linear classification

    Authors: Lucas Clarté, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: Being able to reliably assess not only the \emph{accuracy} but also the \emph{uncertainty} of models' predictions is an important endeavour in modern machine learning. Even if the model generating the data and labels is known, computing the intrinsic uncertainty after learning the model from a limited number of samples amounts to sampling the corresponding posterior probability measure. Such sampl…

    Submitted 14 November, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Journal ref: Mach. Learn.: Sci. Technol. 4 025029 (2023)

  18. arXiv:2202.00293  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

    Authors: Rodrigo Veiga, Ludovic Stephan, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: Despite the non-convex optimization landscape, over-parametrized shallow networks are able to achieve global convergence under gradient descent. The picture can be radically different for narrow networks, which tend to get stuck in badly-generalizing local minima. Here we investigate the cross-over between these two regimes in the high-dimensional setting, and in particular investigate the connect…

    Submitted 14 June, 2023; v1 submitted 1 February, 2022; originally announced February 2022.

    Comments: 20 pages

    Journal ref: Advances in Neural Information Processing Systems (2022), vol 35, pages 23244--23255

  19. arXiv:2201.13383  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension

    Authors: Bruno Loureiro, Cédric Gerbelot, Maria Refinetti, Gabriele Sicuro, Florent Krzakala

    Abstract: From the sampling of data to the initialisation of parameters, randomness is ubiquitous in modern Machine Learning practice. Understanding the statistical fluctuations engendered by the different sources of randomness in prediction is therefore key to understanding robust generalisation. In this manuscript we develop a quantitative and rigorous theory for the study of fluctuations in an ensemble o…

    Submitted 31 January, 2022; originally announced January 2022.

    Comments: 17 pages + Appendix

    Journal ref: Proceedings of the 39th International Conference on Machine Learning (ICML). PMLR 162:14283-14314, 2022

  20. arXiv:2106.03791  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Learning Gaussian Mixtures with Generalised Linear Models: Precise Asymptotics in High-dimensions

    Authors: Bruno Loureiro, Gabriele Sicuro, Cédric Gerbelot, Alessandro Pacco, Florent Krzakala, Lenka Zdeborová

    Abstract: Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks. In this manuscript, we characterise the learning of a mixture of $K$ Gaussians with generic means and covariances via empirical risk minimisation (ERM) with any convex loss and regularisation. In particular, we prove exact asymptotics characterising the ERM…

    Submitted 14 December, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: 12 pages + 34 pages of Appendix, 10 figures

    Journal ref: Advances in Neural Information Processing Systems 34 (2021): 10144-10157

  21. arXiv:2105.15004  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime

    Authors: Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: In this manuscript we consider Kernel Ridge Regression (KRR) under the Gaussian design. Exponents for the decay of the excess generalization error of KRR have been reported in various works under the assumption of power-law decay of eigenvalues of the features co-variance. These decays were, however, provided for sizeably different setups, namely in the noiseless case with constant regularization…

    Submitted 15 December, 2021; v1 submitted 31 May, 2021; originally announced May 2021.

    Comments: 22 pages, 10 figures, 2 tables

    Journal ref: 35th Conference on Neural Information Processing Systems (NeurIPS 2021) vol 34 p10131--10143. J. Stat. Mech. (2022) 114004

  22. arXiv:2102.08127  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG math.PR math.ST

    Learning curves of generic features maps for realistic datasets with a teacher-student model

    Authors: Bruno Loureiro, Cédric Gerbelot, Hugo Cui, Sebastian Goldt, Florent Krzakala, Marc Mézard, Lenka Zdeborová

    Abstract: Teacher-student models provide a framework in which the typical-case performance of high-dimensional supervised learning can be described in closed form. The assumptions of Gaussian i.i.d. input data underlying the canonical teacher-student model may, however, be perceived as too restrictive to capture the behaviour of realistic data sets. In this paper, we introduce a Gaussian covariate generalis…

    Submitted 14 December, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: v3: NeurIPS camera-ready

    Journal ref: 35th Conference on Neural Information Processing Systems (NeurIPS 2021), vol 34 p10137--18151. J. Stat. Mech. (2022) 114001

  23. Reply to Comment on "Chaotic-Integrable Transition in the Sachdev-Ye-Kitaev Model"

    Authors: Antonio M. García-García, Bruno Loureiro, Aurelio Romero-Bermúdez, Masaki Tezuka

    Abstract: In a recent comment to the paper Chaotic Integrable transition in the SYK model, it was claimed that, in a certain region of parameters, the Lyapunov exponent of the N Majoranas Sachdev-Ye-Kitaev model with a quadratic perturbation, is always positive. This implies that the model is quantum chaotic. In this reply, we show that the employed perturbative formalism breaks down precisely in the range…

    Submitted 12 March, 2021; v1 submitted 12 July, 2020; originally announced July 2020.

    Comments: v1: 2 pages and 3 figures. v2: corrected typos, published PRL version

    Journal ref: Phys. Rev. Lett. 126, 109102 - Published 11 March 2021

  24. arXiv:2006.14709  [pdf, other]

    stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.LG

    The Gaussian equivalence of generative models for learning with shallow neural networks

    Authors: Sebastian Goldt, Bruno Loureiro, Galen Reeves, Florent Krzakala, Marc Mézard, Lenka Zdeborová

    Abstract: Understanding the impact of data structure on the computational tractability of learning is a key challenge for the theory of neural networks. Many theoretical works do not explicitly model training data, or assume that inputs are drawn component-wise independently from some simple probability distribution. Here, we go beyond this simple paradigm by studying the performance of neural networks trai…

    Submitted 21 May, 2021; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: The accompanying code for this paper is available at https://github.com/sgoldt/gaussian-equiv-2layer

    Journal ref: Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, PMLR 145:426-471 (2021)

  25. arXiv:2006.05228  [pdf, other]

    math.ST cond-mat.dis-nn cs.IT cs.LG math.PR

    Phase retrieval in high dimensions: Statistical and computational phase transitions

    Authors: Antoine Maillard, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: We consider the phase retrieval problem of reconstructing a $n$-dimensional real or complex signal $\mathbf{X}^{\star}$ from $m$ (possibly noisy) observations $Y_μ= | \sum_{i=1}^n Φ_{μi} X^{\star}_i/\sqrt{n}|$, for a large class of correlated real and complex random sensing matrices $\mathbfΦ$, in a high-dimensional setting where $m,n\to\infty$ while $α= m/n=Θ(1)$. First, we derive sharp asymptoti…

    Submitted 23 October, 2020; v1 submitted 9 June, 2020; originally announced June 2020.

    Comments: 12 pages (main text and references), 26 pages of supplementary material. v2 matches the final version accepted at NeurIPS 2021

    Journal ref: Advances in Neural Information Processing Systems, v33, pages 11071--11082, 2020

  26. arXiv:1912.02008  [pdf, other]

    math.ST cond-mat.dis-nn cs.LG eess.SP stat.ML

    Exact asymptotics for phase retrieval and compressed sensing with random generative priors

    Authors: Benjamin Aubin, Bruno Loureiro, Antoine Baker, Florent Krzakala, Lenka Zdeborová

    Abstract: We consider the problem of compressed sensing and of (real-valued) phase retrieval with random measurement matrix. We derive sharp asymptotics for the information-theoretically optimal performance and for the best known polynomial algorithm for an ensemble of generative priors consisting of fully connected deep neural networks with random weight matrices and arbitrary activations. We compare the p…

    Submitted 12 June, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

    Comments: 13+3 pages, 7 figures, v2 revised and accepted at MSML

    Journal ref: Proceedings of The First Mathematical and Scientific Machine Learning Conference, PMLR 107:55-73, 2020