Showing 1–13 of 13 results for author: Gerbelot, C

  1. arXiv:2502.20003  [pdf, other]

    stat.ML cs.LG

    Asymptotics of Non-Convex Generalized Linear Models in High-Dimensions: A proof of the replica formula

    Authors: Matteo Vilucchio, Yatin Dandi, Cedric Gerbelot, Florent Krzakala

    Abstract: The analytic characterization of the high-dimensional behavior of optimization for Generalized Linear Models (GLMs) with Gaussian data has been a central focus in statistics and probability in recent years. While convex cases, such as the LASSO, ridge regression, and logistic regression, have been extensively studied using a variety of techniques, the non-convex case remains far less understood de…

    Submitted 27 February, 2025; originally announced February 2025.

  2. arXiv:2412.14650  [pdf, other]

    math.PR cs.LG stat.ML

    Permutation recovery of spikes in noisy high-dimensional tensor estimation

    Authors: Gérard Ben Arous, Cédric Gerbelot, Vanessa Piccolo

    Abstract: We study the dynamics of gradient flow in high dimensions for the multi-spiked tensor problem, where the goal is to estimate $r$ unknown signal vectors (spikes) from noisy Gaussian tensor observations. Specifically, we analyze the maximum likelihood estimation procedure, which involves optimizing a highly nonconvex random function. We determine the sample complexity required for gradient flow to e…

    Submitted 20 December, 2024; v1 submitted 19 December, 2024; originally announced December 2024.

    Comments: 29 pages, 2 figures. arXiv admin note: substantial text overlap with arXiv:2408.06401

    MSC Class: 68Q87; 62F30; 60H30

  3. arXiv:2410.18162  [pdf, other]

    stat.ML cs.LG math.PR math.ST

    Stochastic gradient descent in high dimensions for multi-spiked tensor PCA

    Authors: Gérard Ben Arous, Cédric Gerbelot, Vanessa Piccolo

    Abstract: We study the dynamics in high dimensions of online stochastic gradient descent for the multi-spiked tensor model. This multi-index model arises from the tensor principal component analysis (PCA) problem with multiple spikes, where the goal is to estimate $r$ unknown signal vectors within the $N$-dimensional unit sphere through maximum likelihood estimation from noisy observations of a $p$-tensor.…

    Submitted 23 October, 2024; originally announced October 2024.

    Comments: 58 pages, 10 figures. This is part of our manuscript arXiv:2408.06401

    MSC Class: 68Q87; 62F30; 60G42

  4. arXiv:2408.06401  [pdf, ps, other]

    stat.ML cs.LG math.PR math.ST

    Langevin dynamics for high-dimensional optimization: the case of multi-spiked tensor PCA

    Authors: Gérard Ben Arous, Cédric Gerbelot, Vanessa Piccolo

    Abstract: We study nonconvex optimization in high dimensions through Langevin dynamics, focusing on the multi-spiked tensor PCA problem. This tensor estimation problem involves recovering $r$ hidden signal vectors (spikes) from noisy Gaussian tensor observations using maximum likelihood estimation. We study the number of samples required for Langevin dynamics to efficiently recover the spikes and determine…

    Submitted 19 December, 2024; v1 submitted 12 August, 2024; originally announced August 2024.

    Comments: 65 pages

    MSC Class: 68Q87; 62F30; 60G44; 60H30

  5. arXiv:2311.15404  [pdf, other]

    cs.LG cond-mat.dis-nn stat.ML

    Applying statistical learning theory to deep learning

    Authors: Cédric Gerbelot, Avetik Karagulyan, Stefani Karp, Kavya Ravichandran, Menachem Stern, Nathan Srebro

    Abstract: Although statistical learning theory provides a robust framework to understand supervised learning, many theoretical aspects of deep learning remain unclear, in particular how different architectures may lead to inductive bias when trained using gradient based methods. The goal of these lectures is to provide an overview of some of the main questions that arise when attempting to understand deep l…

    Submitted 25 March, 2024; v1 submitted 26 November, 2023; originally announced November 2023.

    Comments: 66 pages, 20 figures

  6. arXiv:2210.06591  [pdf, other]

    math-ph cs.IT cs.LG stat.ML

    Rigorous dynamical mean field theory for stochastic gradient descent methods

    Authors: Cedric Gerbelot, Emanuele Troiani, Francesca Mignacco, Florent Krzakala, Lenka Zdeborova

    Abstract: We prove closed-form equations for the exact high-dimensional asymptotics of a family of first order gradient-based methods, learning an estimator (e.g. M-estimator, shallow neural network, ...) from observations on Gaussian data with empirical risk minimization. This includes widely used algorithms such as stochastic gradient descent (SGD) or Nesterov acceleration. The obtained equations match th…

    Submitted 29 November, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: 40 pages, 4 figures

    Journal ref: SIAM Journal on Mathematics of Data Science 6.2 (2024): 400-427

  7. arXiv:2203.12094  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Learning curves for the multi-class teacher-student perceptron

    Authors: Elisabetta Cornacchia, Francesca Mignacco, Rodrigo Veiga, Cédric Gerbelot, Bruno Loureiro, Lenka Zdeborová

    Abstract: One of the most classical results in high-dimensional learning theory provides a closed-form expression for the generalisation error of binary classification with the single-layer teacher-student perceptron on i.i.d. Gaussian inputs. Both Bayes-optimal estimation and empirical risk minimisation (ERM) were extensively analysed for this setting. At the same time, a considerable part of modern machin…

    Submitted 22 March, 2022; originally announced March 2022.

    Comments: 14 pages + appendix

    Journal ref: Machine Learning: Science and Technology 4 015019 (2022)

  8. arXiv:2201.13383  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension

    Authors: Bruno Loureiro, Cédric Gerbelot, Maria Refinetti, Gabriele Sicuro, Florent Krzakala

    Abstract: From the sampling of data to the initialisation of parameters, randomness is ubiquitous in modern Machine Learning practice. Understanding the statistical fluctuations engendered by the different sources of randomness in prediction is therefore key to understanding robust generalisation. In this manuscript we develop a quantitative and rigorous theory for the study of fluctuations in an ensemble o…

    Submitted 31 January, 2022; originally announced January 2022.

    Comments: 17 pages + Appendix

    Journal ref: Proceedings of the 39th International Conference on Machine Learning (ICML). PMLR 162:14283-14314, 2022

  9. arXiv:2109.11905  [pdf, ps, other]

    cs.IT math.PR math.ST stat.ML

    Graph-based Approximate Message Passing Iterations

    Authors: Cédric Gerbelot, Raphaël Berthier

    Abstract: Approximate-message passing (AMP) algorithms have become an important element of high-dimensional statistical inference, mostly due to their adaptability and concentration properties, the state evolution (SE) equations. This is demonstrated by the growing number of new iterations proposed for increasingly complex problems, ranging from multi-layer inference to low-rank matrix estimation with elabo…

    Submitted 19 April, 2022; v1 submitted 24 September, 2021; originally announced September 2021.

    Comments: 59 pages, 24 main, 35 appendix

  10. arXiv:2106.03791  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Learning Gaussian Mixtures with Generalised Linear Models: Precise Asymptotics in High-dimensions

    Authors: Bruno Loureiro, Gabriele Sicuro, Cédric Gerbelot, Alessandro Pacco, Florent Krzakala, Lenka Zdeborová

    Abstract: Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks. In this manuscript, we characterise the learning of a mixture of $K$ Gaussians with generic means and covariances via empirical risk minimisation (ERM) with any convex loss and regularisation. In particular, we prove exact asymptotics characterising the ERM…

    Submitted 14 December, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: 12 pages + 34 pages of Appendix, 10 figures

    Journal ref: Advances in Neural Information Processing Systems 34 (2021): 10144-10157

  11. arXiv:2102.08127  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG math.PR math.ST

    Learning curves of generic features maps for realistic datasets with a teacher-student model

    Authors: Bruno Loureiro, Cédric Gerbelot, Hugo Cui, Sebastian Goldt, Florent Krzakala, Marc Mézard, Lenka Zdeborová

    Abstract: Teacher-student models provide a framework in which the typical-case performance of high-dimensional supervised learning can be described in closed form. The assumptions of Gaussian i.i.d. input data underlying the canonical teacher-student model may, however, be perceived as too restrictive to capture the behaviour of realistic data sets. In this paper, we introduce a Gaussian covariate generalis…

    Submitted 14 December, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: v3: NeurIPS camera-ready

    Journal ref: 35th Conference on Neural Information Processing Systems (NeurIPS 2021), vol. 34, pp. 18137-18151; J. Stat. Mech. (2022) 114001

  12. arXiv:2006.06581  [pdf, other]

    stat.ML cond-mat.dis-nn cs.IT cs.LG math.PR

    Asymptotic Errors for Teacher-Student Convex Generalized Linear Models (or : How to Prove Kabashima's Replica Formula)

    Authors: Cedric Gerbelot, Alia Abbara, Florent Krzakala

    Abstract: There has been a recent surge of interest in the study of asymptotic reconstruction performance in various cases of generalized linear estimation problems in the teacher-student setting, especially for the case of i.i.d. standard normal matrices. Here, we go beyond these matrices, and prove an analytical formula for the reconstruction performance of convex generalized linear models with rotationall…

    Submitted 10 November, 2022; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: 53 pages, 4 figures

    Journal ref: IEEE Transactions on Information Theory, vol. 69, no. 3, pp. 1824-1852, March 2023

  13. arXiv:2002.04372  [pdf, other]

    stat.ML cond-mat.dis-nn cond-mat.stat-mech

    Asymptotic errors for convex penalized linear regression beyond Gaussian matrices

    Authors: Cédric Gerbelot, Alia Abbara, Florent Krzakala

    Abstract: We consider the problem of learning a coefficient vector $x_{0} \in \mathbb{R}^{N}$ from noisy linear observations $y=Fx_{0}+w \in \mathbb{R}^{M}$ in the high-dimensional limit $M,N \to \infty$ with $\alpha=M/N$ fixed. We provide a rigorous derivation of an explicit formula -- first conjectured using heuristic methods from statistical physics -- for the asymptotic mean squared error obtained by penalized convex regre…

    Submitted 11 February, 2020; originally announced February 2020.

    Comments: 31 pages, 2 figures