Showing 1–34 of 34 results for author: Loureiro, B

  1. arXiv:2506.12454  [pdf, ps, other]

    stat.ML cond-mat.dis-nn cs.CR cs.LG

    On the existence of consistent adversarial attacks in high-dimensional linear classification

    Authors: Matteo Vilucchio, Lenka Zdeborová, Bruno Loureiro

    Abstract: What fundamentally distinguishes an adversarial attack from a misclassification due to limited model expressivity or finite data? In this work, we investigate this question in the setting of high-dimensional binary classification, where statistical effects due to limited data availability play a central role. We introduce a new error metric that precisely captures this distinction, quantifying mode… ▽ More

    Submitted 14 June, 2025; originally announced June 2025.

  2. arXiv:2506.02651  [pdf, ps, other]

    stat.ML cond-mat.dis-nn cs.LG

    Asymptotics of SGD in Sequence-Single Index Models and Single-Layer Attention Networks

    Authors: Luca Arnaboldi, Bruno Loureiro, Ludovic Stephan, Florent Krzakala, Lenka Zdeborová

    Abstract: We study the dynamics of stochastic gradient descent (SGD) for a class of sequence models termed Sequence Single-Index (SSI) models, where the target depends on a single direction in input space applied to a sequence of tokens. This setting generalizes classical single-index models to the sequential domain, encompassing simplified one-layer attention architectures. We derive a closed-form expressi… ▽ More

    Submitted 3 June, 2025; originally announced June 2025.

  3. arXiv:2410.18938  [pdf, other]

    stat.ML cs.LG math.ST

    A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities

    Authors: Yatin Dandi, Luca Pesce, Hugo Cui, Florent Krzakala, Yue M. Lu, Bruno Loureiro

    Abstract: A key property of neural networks is their capacity of adapting to data during training. Yet, our current mathematical understanding of feature learning and its relationship to generalization remains limited. In this work, we provide a random matrix analysis of how fully-connected two-layer neural networks adapt to the target function after a single, but aggressive, gradient descent step. We rigoro… ▽ More

    Submitted 24 October, 2024; originally announced October 2024.

  4. arXiv:2410.16073  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG math.ST

    On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds

    Authors: Matteo Vilucchio, Nikolaos Tsilivis, Bruno Loureiro, Julia Kempe

    Abstract: Regularization, whether explicit in terms of a penalty in the loss or implicit in the choice of algorithm, is a cornerstone of modern machine learning. Indeed, controlling the complexity of the model class is particularly important when data is scarce, noisy or contaminated, as it translates a statistical belief on the underlying structure of the data. This work investigates the question of how to… ▽ More

    Submitted 21 October, 2024; originally announced October 2024.

  5. arXiv:2410.13300  [pdf, other]

    stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.LG math.ST

    A theoretical perspective on mode collapse in variational inference

    Authors: Roman Soletskyi, Marylou Gabrié, Bruno Loureiro

    Abstract: While deep learning has expanded the possibilities for highly expressive variational families, the practical benefits of these tools for variational inference (VI) are often limited by the minimization of the traditional Kullback-Leibler objective, which can yield suboptimal solutions. A major challenge in this context is \emph{mode collapse}: the phenomenon where a model concentrates on a few mod… ▽ More

    Submitted 17 October, 2024; originally announced October 2024.

  6. arXiv:2406.02157  [pdf, other]

    stat.ML cs.LG

    Online Learning and Information Exponents: On The Importance of Batch size, and Time/Complexity Tradeoffs

    Authors: Luca Arnaboldi, Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan

    Abstract: We study the impact of the batch size $n_b$ on the iteration time $T$ of training two-layer neural networks with one-pass stochastic gradient descent (SGD) on multi-index target functions of isotropic covariates. We characterize the optimal batch size minimizing the iteration time as a function of the hardness of the target, as characterized by the information exponents. We show that performing gr… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, PMLR 235:1730-1762, 2024

  7. arXiv:2405.15699  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Dimension-free deterministic equivalents and scaling laws for random feature regression

    Authors: Leonardo Defilippis, Bruno Loureiro, Theodor Misiakiewicz

    Abstract: In this work we investigate the generalization performance of random feature ridge regression (RFRR). Our main contribution is a general deterministic equivalent for the test error of RFRR. Specifically, under a certain concentration property, we show that the test error is well approximated by a closed-form expression that only depends on the feature map eigenvalues. Notably, our approximation gu… ▽ More

    Submitted 5 November, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: NeurIPS 2024 camera-ready version

  8. arXiv:2402.13999  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG math.ST

    Asymptotics of Learning with Deep Structured (Random) Features

    Authors: Dominik Schröder, Daniil Dmitriev, Hugo Cui, Bruno Loureiro

    Abstract: For a large class of feature maps we provide a tight asymptotic characterisation of the test error associated with learning the readout layer, in the high-dimensional limit where the input dimension, hidden layer widths, and number of training samples are proportionally large. This characterization is formulated in terms of the population covariance of the features. Our work is partially motivated… ▽ More

    Submitted 10 June, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: ICML camera-ready version

  9. arXiv:2402.13622  [pdf, ps, other]

    stat.ML cond-mat.dis-nn cs.LG

    Analysis of Bootstrap and Subsampling in High-dimensional Regularized Regression

    Authors: Lucas Clarté, Adrien Vandenbroucque, Guillaume Dalle, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: We investigate popular resampling methods for estimating the uncertainty of statistical models, such as subsampling, bootstrap and the jackknife, and their performance in high-dimensional supervised regression tasks. We provide a tight asymptotic description of the biases and variances estimated by these methods in the context of generalized linear models, such as ridge and logistic regression, ta… ▽ More

    Submitted 1 November, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Journal ref: Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:787-819, 2024

  10. arXiv:2402.05674  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    A High Dimensional Statistical Model for Adversarial Training: Geometry and Trade-Offs

    Authors: Kasimir Tanner, Matteo Vilucchio, Bruno Loureiro, Florent Krzakala

    Abstract: This work investigates adversarial training in the context of margin-based linear classifiers in the high-dimensional regime where the dimension $d$ and the number of data points $n$ diverge with a fixed ratio $α= n / d$. We introduce a tractable mathematical model where the interplay between the data and adversarial attacker geometries can be studied, while capturing the core phenomenology observ… ▽ More

    Submitted 27 December, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  11. arXiv:2402.04980  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Asymptotics of feature learning in two-layer networks after one gradient-step

    Authors: Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro

    Abstract: In this manuscript, we investigate the problem of how two-layer neural networks learn features from data, and improve over the kernel regime, after being trained with a single gradient descent step. Leveraging the insight from (Ba et al., 2022), we model the trained network by a spiked Random Features (sRF) model. Further building on recent progress on Gaussian universality (Dandi et al., 2023), w… ▽ More

    Submitted 4 June, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, PMLR 235:9662-9695, 2024

  12. arXiv:2309.16476  [pdf, other]

    math.ST cond-mat.dis-nn cs.LG stat.ML

    High-dimensional robust regression under heavy-tailed data: Asymptotics and Universality

    Authors: Urte Adomaityte, Leonardo Defilippis, Bruno Loureiro, Gabriele Sicuro

    Abstract: We investigate the high-dimensional properties of robust regression estimators in the presence of heavy-tailed contamination of both the covariates and response functions. In particular, we provide a sharp asymptotic characterisation of M-estimators trained on a family of elliptical covariate and noise data distributions including cases where second and higher moments do not exist. We show that, d… ▽ More

    Submitted 31 May, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: 13 pages + Supplementary information

  13. arXiv:2305.18502  [pdf, other]

    stat.ML cs.LG

    Escaping mediocrity: how two-layer networks learn hard generalized linear models with SGD

    Authors: Luca Arnaboldi, Florent Krzakala, Bruno Loureiro, Ludovic Stephan

    Abstract: This study explores the sample complexity for two-layer neural networks to learn a generalized linear target function under Stochastic Gradient Descent (SGD), focusing on the challenging regime where many flat directions are present at initialization. It is well-established that in this scenario $n=O(d \log d)$ samples are typically needed. However, we provide precise results concerning the pre-fa… ▽ More

    Submitted 1 March, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

  14. arXiv:2305.18270  [pdf, other]

    stat.ML cs.LG

    How Two-Layer Neural Networks Learn, One (Giant) Step at a Time

    Authors: Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan

    Abstract: For high-dimensional Gaussian data, we investigate theoretically how the features of a two-layer neural network adapt to the structure of the target function through a few large batch gradient descent steps, leading to an improvement in the approximation capacity from initialization. First, we compare the influence of batch size to that of multiple steps. For a single step, a batch of size… ▽ More

    Submitted 3 June, 2025; v1 submitted 29 May, 2023; originally announced May 2023.

    Journal ref: Journal of Machine Learning Research 25 (2024) 1-65

  15. arXiv:2303.02644  [pdf, other]

    cs.LG stat.ML

    Expectation consistency for calibration of neural networks

    Authors: Lucas Clarté, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: Despite their incredible performance, it is well reported that deep neural networks tend to be overoptimistic about their prediction confidence. Finding effective and efficient calibration methods for neural networks is therefore an important endeavour towards better uncertainty quantification in deep learning. In this manuscript, we introduce a novel calibration technique named expectation consis… ▽ More

    Submitted 4 August, 2023; v1 submitted 5 March, 2023; originally announced March 2023.

    Journal ref: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:443-453, 2023

  16. arXiv:2302.08933  [pdf, other]

    math.ST stat.ML

    Universality laws for Gaussian mixtures in generalized linear models

    Authors: Yatin Dandi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro, Lenka Zdeborová

    Abstract: Let $(x_{i}, y_{i})_{i=1,\dots,n}$ denote independent samples from a general mixture distribution $\sum_{c\in\mathcal{C}}ρ_{c}P_{c}^{x}$, and consider the hypothesis class of generalized linear models $\hat{y} = F(Θ^{\top}x)$. In this work, we investigate the asymptotic joint statistics of the family of generalized linear estimators $(Θ_{1}, \dots, Θ_{M})$ obtained either from (a) minimizing an em… ▽ More

    Submitted 17 February, 2023; originally announced February 2023.

    Journal ref: Advances in Neural Information Processing Systems 36 (2023)

  17. arXiv:2302.08923  [pdf, other]

    math.ST cond-mat.dis-nn cs.LG stat.ML

    Are Gaussian data all you need? Extents and limits of universality in high-dimensional generalized linear estimation

    Authors: Luca Pesce, Florent Krzakala, Bruno Loureiro, Ludovic Stephan

    Abstract: In this manuscript we consider the problem of generalized linear estimation on Gaussian mixture data with labels given by a single-index model. Our first result is a sharp asymptotic expression for the test and training errors in the high-dimensional regime. Motivated by the recent stream of results on the Gaussian universality of the test and training errors in generalized linear estimation, we a… ▽ More

    Submitted 17 February, 2023; originally announced February 2023.

  18. arXiv:2302.05882  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks

    Authors: Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro

    Abstract: This manuscript investigates the one-pass stochastic gradient descent (SGD) dynamics of a two-layer neural network trained on Gaussian data and labels generated by a similar, though not necessarily identical, target function. We rigorously analyse the limiting dynamics via a deterministic and low-dimensional description in terms of the sufficient statistics for the population risk. Our unifying an… ▽ More

    Submitted 12 February, 2023; originally announced February 2023.

  19. arXiv:2302.00401  [pdf, other]

    stat.ML cs.LG

    Deterministic equivalent and error universality of deep random features learning

    Authors: Dominik Schröder, Hugo Cui, Daniil Dmitriev, Bruno Loureiro

    Abstract: This manuscript considers the problem of learning a random Gaussian network function using a fully connected network with frozen intermediate layers and trainable readout layer. This problem can be seen as a natural generalization of the widely studied random features model to deeper architectures. First, we prove Gaussian universality of the test error in a ridge regression setting where the lear… ▽ More

    Submitted 1 February, 2023; originally announced February 2023.

  20. arXiv:2205.13527  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG math.PR math.ST

    Subspace clustering in high-dimensions: Phase transitions & Statistical-to-Computational gap

    Authors: Luca Pesce, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: A simple model to study subspace clustering is the high-dimensional $k$-Gaussian mixture model where the cluster means are sparse vectors. Here we provide an exact asymptotic characterization of the statistically optimal reconstruction error in this model in the high-dimensional regime with extensive sparsity, i.e. when the fraction of non-zero components of the cluster means $ρ$, as well as the r… ▽ More

    Submitted 1 December, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: NeurIPS camera-ready version

    Journal ref: Advances in Neural Information Processing Systems (2022), vol 35, pages 27087--27099

  21. arXiv:2205.13303  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG math.PR math.ST

    Gaussian Universality of Perceptrons with Random Labels

    Authors: Federica Gerace, Florent Krzakala, Bruno Loureiro, Ludovic Stephan, Lenka Zdeborová

    Abstract: While classical in many theoretical settings - and in particular in statistical physics-inspired works - the assumption of Gaussian i.i.d. input data is often perceived as a strong limitation in the context of statistics and machine learning. In this study, we redeem this line of work in the case of generalized linear classification, a.k.a. the perceptron model, with random labels. We argue that t… ▽ More

    Submitted 2 March, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

    Journal ref: Physical Review E 109.3 (2024): 034305

  22. arXiv:2203.12094  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Learning curves for the multi-class teacher-student perceptron

    Authors: Elisabetta Cornacchia, Francesca Mignacco, Rodrigo Veiga, Cédric Gerbelot, Bruno Loureiro, Lenka Zdeborová

    Abstract: One of the most classical results in high-dimensional learning theory provides a closed-form expression for the generalisation error of binary classification with the single-layer teacher-student perceptron on i.i.d. Gaussian inputs. Both Bayes-optimal estimation and empirical risk minimisation (ERM) were extensively analysed for this setting. At the same time, a considerable part of modern machin… ▽ More

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: 14 pages + appendix

    Journal ref: Machine Learning: Science and Technology 4 015019 (2022)

  23. arXiv:2202.03295  [pdf, other]

    cs.LG cond-mat.dis-nn stat.ML

    Theoretical characterization of uncertainty in high-dimensional linear classification

    Authors: Lucas Clarté, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: Being able to reliably assess not only the \emph{accuracy} but also the \emph{uncertainty} of models' predictions is an important endeavour in modern machine learning. Even if the model generating the data and labels is known, computing the intrinsic uncertainty after learning the model from a limited number of samples amounts to sampling the corresponding posterior probability measure. Such sampl… ▽ More

    Submitted 14 November, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Journal ref: Mach. Learn.: Sci. Technol. 4 025029 (2023)

  24. arXiv:2202.00293  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

    Authors: Rodrigo Veiga, Ludovic Stephan, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: Despite the non-convex optimization landscape, over-parametrized shallow networks are able to achieve global convergence under gradient descent. The picture can be radically different for narrow networks, which tend to get stuck in badly-generalizing local minima. Here we investigate the cross-over between these two regimes in the high-dimensional setting, and in particular investigate the connect… ▽ More

    Submitted 14 June, 2023; v1 submitted 1 February, 2022; originally announced February 2022.

    Comments: 20 pages

    Journal ref: Advances in Neural Information Processing Systems (2022), vol 35, pages 23244--23255

  25. arXiv:2201.13383  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension

    Authors: Bruno Loureiro, Cédric Gerbelot, Maria Refinetti, Gabriele Sicuro, Florent Krzakala

    Abstract: From the sampling of data to the initialisation of parameters, randomness is ubiquitous in modern Machine Learning practice. Understanding the statistical fluctuations engendered by the different sources of randomness in prediction is therefore key to understanding robust generalisation. In this manuscript we develop a quantitative and rigorous theory for the study of fluctuations in an ensemble o… ▽ More

    Submitted 31 January, 2022; originally announced January 2022.

    Comments: 17 pages + Appendix

    Journal ref: Proceedings of the 39th International Conference on Machine Learning (ICML). PMLR 162:14283-14314, 2022

  26. Error Scaling Laws for Kernel Classification under Source and Capacity Conditions

    Authors: Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: We consider the problem of kernel classification. While worst-case bounds on the decay rate of the prediction error with the number of samples are known for some classifiers, they often fail to accurately describe the learning curves of real data sets. In this work, we consider the important class of data sets satisfying the standard source and capacity conditions, comprising a number of real data… ▽ More

    Submitted 6 September, 2023; v1 submitted 29 January, 2022; originally announced January 2022.

    Journal ref: Mach. Learn.: Sci. Technol. (2023) 4 035033

  27. arXiv:2201.09986  [pdf, ps, other]

    cs.IT cs.CR cs.LG stat.ML

    Bayesian Inference with Nonlinear Generative Models: Comments on Secure Learning

    Authors: Ali Bereyhi, Bruno Loureiro, Florent Krzakala, Ralf R. Müller, Hermann Schulz-Baldes

    Abstract: Unlike the classical linear model, nonlinear generative models have been addressed sparsely in the literature of statistical learning. This work aims to bring attention to these models and their secrecy potential. To this end, we invoke the replica method to derive the asymptotic normalized cross entropy in an inverse probability problem whose generative model is described by a Gaussian random… ▽ More

    Submitted 13 July, 2022; v1 submitted 19 January, 2022; originally announced January 2022.

    Comments: 72 pages, 14 figures

  28. arXiv:2106.03791  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Learning Gaussian Mixtures with Generalised Linear Models: Precise Asymptotics in High-dimensions

    Authors: Bruno Loureiro, Gabriele Sicuro, Cédric Gerbelot, Alessandro Pacco, Florent Krzakala, Lenka Zdeborová

    Abstract: Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks. In this manuscript, we characterise the learning of a mixture of $K$ Gaussians with generic means and covariances via empirical risk minimisation (ERM) with any convex loss and regularisation. In particular, we prove exact asymptotics characterising the ERM… ▽ More

    Submitted 14 December, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: 12 pages + 34 pages of Appendix, 10 figures

    Journal ref: Advances in Neural Information Processing Systems 34 (2021): 10144-10157

  29. arXiv:2105.15004  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime

    Authors: Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová

    Abstract: In this manuscript we consider Kernel Ridge Regression (KRR) under the Gaussian design. Exponents for the decay of the excess generalization error of KRR have been reported in various works under the assumption of power-law decay of eigenvalues of the features co-variance. These decays were, however, provided for sizeably different setups, namely in the noiseless case with constant regularization… ▽ More

    Submitted 15 December, 2021; v1 submitted 31 May, 2021; originally announced May 2021.

    Comments: 22 pages, 10 figures, 2 tables

    Journal ref: 35th Conference on Neural Information Processing Systems (NeurIPS 2021) vol 34 p10131--10143. J. Stat. Mech. (2022) 114004

  30. arXiv:2102.08127  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG math.PR math.ST

    Learning curves of generic features maps for realistic datasets with a teacher-student model

    Authors: Bruno Loureiro, Cédric Gerbelot, Hugo Cui, Sebastian Goldt, Florent Krzakala, Marc Mézard, Lenka Zdeborová

    Abstract: Teacher-student models provide a framework in which the typical-case performance of high-dimensional supervised learning can be described in closed form. The assumptions of Gaussian i.i.d. input data underlying the canonical teacher-student model may, however, be perceived as too restrictive to capture the behaviour of realistic data sets. In this paper, we introduce a Gaussian covariate generalis… ▽ More

    Submitted 14 December, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: v3: NeurIPS camera-ready

    Journal ref: 35th Conference on Neural Information Processing Systems (NeurIPS 2021), vol 34 p18137--18151. J. Stat. Mech. (2022) 114001

  31. arXiv:2006.14709  [pdf, other]

    stat.ML cond-mat.dis-nn cond-mat.stat-mech cs.LG

    The Gaussian equivalence of generative models for learning with shallow neural networks

    Authors: Sebastian Goldt, Bruno Loureiro, Galen Reeves, Florent Krzakala, Marc Mézard, Lenka Zdeborová

    Abstract: Understanding the impact of data structure on the computational tractability of learning is a key challenge for the theory of neural networks. Many theoretical works do not explicitly model training data, or assume that inputs are drawn component-wise independently from some simple probability distribution. Here, we go beyond this simple paradigm by studying the performance of neural networks trai… ▽ More

    Submitted 21 May, 2021; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: The accompanying code for this paper is available at https://github.com/sgoldt/gaussian-equiv-2layer

    Journal ref: Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, PMLR 145:426-471 (2021)

  32. arXiv:2002.09339  [pdf, other]

    math.ST cs.LG math.PR stat.ML

    Generalisation error in learning with random features and the hidden manifold model

    Authors: Federica Gerace, Bruno Loureiro, Florent Krzakala, Marc Mézard, Lenka Zdeborová

    Abstract: We study generalised linear regression and classification for a synthetically generated dataset encompassing different problems of interest, such as learning with random features, neural networks in the lazy training regime, and the hidden manifold model. We consider the high-dimensional regime and using the replica method from statistical physics, we provide a closed-form expression for the asymp… ▽ More

    Submitted 20 August, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

    Comments: v2: ICML 2020 camera-ready

    Journal ref: J. Stat. Mech. 2021 124013 & ICML 2020

  33. arXiv:1912.02008  [pdf, other]

    math.ST cond-mat.dis-nn cs.LG eess.SP stat.ML

    Exact asymptotics for phase retrieval and compressed sensing with random generative priors

    Authors: Benjamin Aubin, Bruno Loureiro, Antoine Baker, Florent Krzakala, Lenka Zdeborová

    Abstract: We consider the problem of compressed sensing and of (real-valued) phase retrieval with random measurement matrix. We derive sharp asymptotics for the information-theoretically optimal performance and for the best known polynomial algorithm for an ensemble of generative priors consisting of fully connected deep neural networks with random weight matrices and arbitrary activations. We compare the p… ▽ More

    Submitted 12 June, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

    Comments: 13+3 pages, 7 figures, v2 revised and accepted at MSML

    Journal ref: Proceedings of The First Mathematical and Scientific Machine Learning Conference, PMLR 107:55-73, 2020

  34. arXiv:1905.12385  [pdf, other]

    math.ST cs.LG eess.SP math.PR stat.ML

    The spiked matrix model with generative priors

    Authors: Benjamin Aubin, Bruno Loureiro, Antoine Maillard, Florent Krzakala, Lenka Zdeborová

    Abstract: Using a low-dimensional parametrization of signals is a generic and powerful way to enhance performance in signal processing and statistical inference. A very popular and widely explored type of dimensionality reduction is sparsity; another type is generative modelling of signal distributions. Generative models based on neural networks, such as GANs or variational auto-encoders, are particularly p… ▽ More

    Submitted 30 May, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: 12 + 56, 8 figures, v2 lighter jpeg figures

    Journal ref: Advances in Neural Information Processing Systems, pp. 8364-8375. 2019