Showing 1–17 of 17 results for author: Mondelli, M

  1. arXiv:2502.01583  [pdf, ps, other]

    stat.ML cs.IT cs.LG math.PR math.ST

    Spectral Estimators for Multi-Index Models: Precise Asymptotics and Optimal Weak Recovery

    Authors: Filip Kovačević, Yihan Zhang, Marco Mondelli

    Abstract: Multi-index models provide a popular framework to investigate the learnability of functions with low-dimensional structure and, also due to their connections with neural networks, they have been object of recent intensive study. In this paper, we focus on recovering the subspace spanned by the signals via spectral estimators -- a family of methods routinely used in practice, often as a warm-start…

    Submitted 10 June, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

    Comments: Accepted to COLT 2025

  2. arXiv:2410.04887  [pdf, other]

    cs.LG math.OC stat.ML

    Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse

    Authors: Arthur Jacot, Peter Súkeník, Zihan Wang, Marco Mondelli

    Abstract: Deep neural networks (DNNs) at convergence consistently represent the training data in the last layer via a highly symmetric geometric structure referred to as neural collapse. This empirical evidence has spurred a line of theoretical research aimed at proving the emergence of neural collapse, mostly focusing on the unconstrained features model. Here, the features of the penultimate layer are free…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: 29 pages, 5 figures

  3. arXiv:2405.20993  [pdf, other]

    cs.IT cond-mat.dis-nn cs.LG math.ST

    Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise

    Authors: Jean Barbier, Francesco Camilli, Marco Mondelli, Yizhou Xu

    Abstract: We consider a prototypical problem of Bayesian inference for a structured spiked model: a low-rank signal is corrupted by additive noise. While both information-theoretic and algorithmic limits are well understood when the noise is a Gaussian Wigner matrix, the more realistic case of structured noise still proves to be challenging. To capture the structure while maintaining mathematical tractabili…

    Submitted 8 July, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

    MSC Class: 62F15; 82B44

  4. arXiv:2405.14468  [pdf, other]

    cs.LG math.OC stat.ML

    Neural Collapse versus Low-rank Bias: Is Deep Neural Collapse Really Optimal?

    Authors: Peter Súkeník, Marco Mondelli, Christoph Lampert

    Abstract: Deep neural networks (DNNs) exhibit a surprising structure in their final layer known as neural collapse (NC), and a growing body of works has currently investigated the propagation of neural collapse to earlier layers of DNNs -- a phenomenon called deep neural collapse (DNC). However, existing theoretical results are restricted to special cases: linear models, only two layers or binary classifica…

    Submitted 21 October, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  5. arXiv:2405.13912  [pdf, other]

    math.ST cs.IT cs.LG math.PR stat.ML

    Matrix Denoising with Doubly Heteroscedastic Noise: Fundamental Limits and Optimal Spectral Methods

    Authors: Yihan Zhang, Marco Mondelli

    Abstract: We study the matrix denoising problem of estimating the singular vectors of a rank-$1$ signal corrupted by noise with both column and row correlations. Existing works are either unable to pinpoint the exact asymptotic estimation error or, when they do so, the resulting approaches (e.g., based on whitening or singular value shrinkage) remain vastly suboptimal. On top of this, most of the literature…

    Submitted 28 October, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  6. arXiv:2402.11200  [pdf, other]

    cs.IT math.FA math.PR

    Contraction of Markovian Operators in Orlicz Spaces and Error Bounds for Markov Chain Monte Carlo

    Authors: Amedeo Roberto Esposito, Marco Mondelli

    Abstract: We introduce a novel concept of convergence for Markovian processes within Orlicz spaces, extending beyond the conventional approach associated with $L_p$ spaces. After showing that Markovian operators are contractive in Orlicz spaces, our key technical contribution is an upper bound on their contraction coefficient, which admits a closed-form expression. The bound is tight in some settings, and i…

    Submitted 11 June, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

    Comments: Full version of the work accepted for presentation at the Conference on Learning Theory (COLT) 2024

  7. arXiv:2308.14507  [pdf, other]

    math.ST cs.IT cs.LG math.PR stat.ML

    Spectral Estimators for Structured Generalized Linear Models via Approximate Message Passing

    Authors: Yihan Zhang, Hong Chang Ji, Ramji Venkataramanan, Marco Mondelli

    Abstract: We consider the problem of parameter estimation in a high-dimensional generalized linear model. Spectral methods obtained via the principal eigenvector of a suitable data-dependent matrix provide a simple yet surprisingly effective solution. However, despite their wide use, a rigorous performance characterization, as well as a principled way to preprocess the data, are available only for unstructu…

    Submitted 3 July, 2024; v1 submitted 28 August, 2023; originally announced August 2023.

  8. arXiv:2305.14164  [pdf, other]

    cs.LG math.ST stat.ML

    Improved Convergence of Score-Based Diffusion Models via Prediction-Correction

    Authors: Francesco Pedrotti, Jan Maas, Marco Mondelli

    Abstract: Score-based generative models (SGMs) are powerful tools to sample from complex data distributions. Their underlying idea is to (i) run a forward process for time $T_1$ by adding noise to the data, (ii) estimate its score function, and (iii) use such estimate to run a reverse process. As the reverse process is initialized with the stationary distribution of the forward one, the existing analysis pa…

    Submitted 4 June, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: 34 pages; accepted to TMLR

  9. arXiv:2303.07245  [pdf, ps, other]

    cs.IT math.PR

    Concentration without Independence via Information Measures

    Authors: Amedeo Roberto Esposito, Marco Mondelli

    Abstract: We propose a novel approach to concentration for non-independent random variables. The main idea is to ``pretend'' that the random variables are independent and pay a multiplicative price measuring how far they are from actually being independent. This price is encapsulated in the Hellinger integral between the joint and the product of the marginals, which is then upper bounded leveraging tensoris…

    Submitted 30 October, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

  10. arXiv:2302.03306  [pdf, other]

    cs.IT cs.LG math.ST

    Mismatched estimation of non-symmetric rank-one matrices corrupted by structured noise

    Authors: Teng Fu, YuHao Liu, Jean Barbier, Marco Mondelli, ShanSuo Liang, TianQi Hou

    Abstract: We study the performance of a Bayesian statistician who estimates a rank-one signal corrupted by non-symmetric rotationally invariant noise with a generic distribution of singular values. As the signal-to-noise ratio and the noise structure are unknown, a Gaussian setup is incorrectly assumed. We derive the exact analytic expression for the error of the mismatched Bayes estimator and also provide…

    Submitted 8 February, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

  11. arXiv:2211.11368  [pdf, other]

    math.ST cs.IT cs.LG stat.ML

    Precise Asymptotics for Spectral Methods in Mixed Generalized Linear Models

    Authors: Yihan Zhang, Marco Mondelli, Ramji Venkataramanan

    Abstract: In a mixed generalized linear model, the objective is to learn multiple signals from unlabeled observations: each sample comes from exactly one signal, but it is not known which one. We consider the prototypical problem of estimating two statistically independent signals in a mixed generalized linear model with Gaussian covariates. Spectral methods are a popular class of estimators which output th…

    Submitted 18 April, 2024; v1 submitted 21 November, 2022; originally announced November 2022.

  12. arXiv:2205.10009  [pdf, other]

    cs.IT cs.LG math.ST

    The price of ignorance: how much does it cost to forget noise structure in low-rank matrix estimation?

    Authors: Jean Barbier, TianQi Hou, Marco Mondelli, Manuel Sáenz

    Abstract: We consider the problem of estimating a rank-1 signal corrupted by structured rotationally invariant noise, and address the following question: how well do inference algorithms perform when the noise statistics is unknown and hence Gaussian noise is assumed? While the matched Bayes-optimal setting with unstructured noise is well understood, the analysis of this mismatched problem is only at its pr…

    Submitted 20 May, 2022; originally announced May 2022.

  13. arXiv:2112.04330  [pdf, other]

    stat.ML cs.IT cs.LG math.ST

    Estimation in Rotationally Invariant Generalized Linear Models via Approximate Message Passing

    Authors: Ramji Venkataramanan, Kevin Kögler, Marco Mondelli

    Abstract: We consider the problem of signal estimation in generalized linear models defined via rotationally invariant design matrices. Since these matrices can have an arbitrary spectral distribution, this model is well suited for capturing complex correlation structures which often arise in applications. We propose a novel family of approximate message passing (AMP) algorithms for signal estimation, and r…

    Submitted 9 June, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: 35 pages, 8 figures, to appear in International Conference on Machine Learning (ICML), 2022

  14. arXiv:2106.02356  [pdf, ps, other]

    stat.ML cs.IT cs.LG math.ST

    PCA Initialization for Approximate Message Passing in Rotationally Invariant Models

    Authors: Marco Mondelli, Ramji Venkataramanan

    Abstract: We study the problem of estimating a rank-$1$ signal in the presence of rotationally invariant noise -- a class of perturbations more general than Gaussian noise. Principal Component Analysis (PCA) provides a natural estimator, and sharp results on its performance have been obtained in the high-dimensional regime. Recently, an Approximate Message Passing (AMP) algorithm has been proposed as an altern…

    Submitted 14 October, 2021; v1 submitted 4 June, 2021; originally announced June 2021.

    Comments: 72 pages, 2 figures, appeared in Neural Information Processing Systems (NeurIPS), 2021

  15. arXiv:2010.03460  [pdf, other]

    stat.ML cs.IT cs.LG math.ST

    Approximate Message Passing with Spectral Initialization for Generalized Linear Models

    Authors: Marco Mondelli, Ramji Venkataramanan

    Abstract: We consider the problem of estimating a signal from measurements obtained via a generalized linear model. We focus on estimators based on approximate message passing (AMP), a family of iterative algorithms with many appealing features: the performance of AMP in the high-dimensional limit can be succinctly characterized under suitable model assumptions; AMP can also be tailored to the empirical dis…

    Submitted 17 February, 2021; v1 submitted 7 October, 2020; originally announced October 2020.

    Comments: 38 pages, 5 figures, AISTATS 2021

  16. arXiv:2008.03326  [pdf, other]

    stat.ML cs.IT cs.LG math.ST

    Optimal Combination of Linear and Spectral Estimators for Generalized Linear Models

    Authors: Marco Mondelli, Christos Thrampoulidis, Ramji Venkataramanan

    Abstract: We study the problem of recovering an unknown signal $\boldsymbol x$ given measurements obtained from a generalized linear model with a Gaussian sensing matrix. Two popular solutions are based on a linear estimator $\hat{\boldsymbol x}^{\rm L}$ and a spectral estimator $\hat{\boldsymbol x}^{\rm s}$. The former is a data-dependent linear combination of the columns of the measurement matrix, and its…

    Submitted 25 June, 2021; v1 submitted 7 August, 2020; originally announced August 2020.

    Comments: 49 pages, 6 figures

  17. arXiv:1901.01375  [pdf, other]

    math.ST cs.LG

    Analysis of a Two-Layer Neural Network via Displacement Convexity

    Authors: Adel Javanmard, Marco Mondelli, Andrea Montanari

    Abstract: Fitting a function by using linear combinations of a large number $N$ of `simple' components is one of the most fruitful ideas in statistical learning. This idea lies at the core of a variety of methods, from two-layer neural networks to kernel regression, to boosting. In general, the resulting risk minimization problem is non-convex and is solved by gradient descent or its variants. Unfortunately…

    Submitted 17 August, 2019; v1 submitted 5 January, 2019; originally announced January 2019.

    Comments: 70 pages, 28 pdf figures