-
Scaling limit for the random walk on critical lattice trees
Authors:
Gérard Ben Arous,
Manuel Cabezas,
Alexander Fribergh
Abstract:
We prove a scaling limit theorem for the simple random walk on critical lattice trees in $\mathbb{Z}^d$, for $d\geq 8$. The scaling limit is the Brownian motion on the Integrated Super-Brownian Excursion (BISE) which is the same one that we have identified earlier for other simpler models of anomalous diffusion on critical graphs in large enough dimension. The proof of this theorem is based on a combination of the tools of lace-expansion (contained in the articles \cite{CFHP} and \cite{CFHP2}), and a new and general convergence theorem.
Submitted 28 March, 2025;
originally announced March 2025.
-
Local geometry of high-dimensional mixture models: Effective spectral theory and dynamical transitions
Authors:
Gerard Ben Arous,
Reza Gheissari,
Jiaoyang Huang,
Aukosh Jagannath
Abstract:
We study the local geometry of empirical risks in high dimensions via the spectral theory of their Hessian and information matrices. We focus on settings where the data, $(Y_\ell)_{\ell =1}^n\in \mathbb R^d$, are i.i.d. draws of a $k$-component Gaussian mixture model, and the loss depends on the projection of the data into a fixed number of vectors, namely $\mathbf{x}^\top Y$, where $\mathbf{x}\in \mathbb{R}^{d\times C}$ are the parameters, and $C$ need not equal $k$. This setting captures a broad class of problems such as classification by one and two-layer networks and regression on multi-index models. We prove exact formulas for the limits of the empirical spectral distribution and outlier eigenvalues and eigenvectors of such matrices in the proportional asymptotics limit, where the number of samples and dimension $n,d\to\infty$ and $n/d=\phi\in (0,\infty)$. These limits depend on the parameters $\mathbf{x}$ only through the summary statistic of the $(C+k)\times (C+k)$ Gram matrix of the parameters and class means, $\mathbf{G} = (\mathbf{x},\boldsymbol{\mu})^\top(\mathbf{x},\boldsymbol{\mu})$. It is known that under general conditions, when $\mathbf{x}$ is trained by stochastic gradient descent, the evolution of these same summary statistics along training converges to the solution of an autonomous system of ODEs, called the effective dynamics. This enables us to connect the spectral theory to the training dynamics. We demonstrate our general results by analyzing the effective spectrum along the effective dynamics in the case of multi-class logistic regression. In this setting, the empirical Hessian and information matrices have substantially different spectra, each with their own static and even dynamical spectral transitions.
Submitted 15 May, 2025; v1 submitted 21 February, 2025;
originally announced February 2025.
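As a rough illustration of the summary statistic named in the abstract above, the following sketch builds the $(C+k)\times(C+k)$ Gram matrix of the parameter vectors and class means. It uses plain Python lists to stay dependency-free; the helper name `gram_matrix` and the toy numbers are assumptions made for illustration and are not taken from the paper.

```python
def gram_matrix(x, mu):
    """Return G = (x, mu)^T (x, mu) for x of shape d x C and mu of shape d x k.

    Both inputs are lists of d rows. The helper name and data layout are
    illustrative assumptions, not the paper's code.
    """
    if len(x) != len(mu):
        raise ValueError("x and mu must have the same number of rows d")
    # Concatenate columns: each row of `stacked` has C + k entries.
    stacked = [list(rx) + list(rm) for rx, rm in zip(x, mu)]
    m = len(stacked[0])
    # G[i][j] is the inner product of columns i and j of the stacked matrix.
    return [
        [sum(row[i] * row[j] for row in stacked) for j in range(m)]
        for i in range(m)
    ]


# Toy example (made-up numbers): d = 3, C = 2 parameter vectors, k = 1 class mean.
x = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
mu = [[1.0], [1.0], [0.0]]
G = gram_matrix(x, mu)
```

Here `G` is a symmetric 3 x 3 matrix whose size depends only on $C+k$ and not on the ambient dimension $d$, which is what makes it a low-dimensional summary.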
-
Permutation recovery of spikes in noisy high-dimensional tensor estimation
Authors:
Gérard Ben Arous,
Cédric Gerbelot,
Vanessa Piccolo
Abstract:
We study the dynamics of gradient flow in high dimensions for the multi-spiked tensor problem, where the goal is to estimate $r$ unknown signal vectors (spikes) from noisy Gaussian tensor observations. Specifically, we analyze the maximum likelihood estimation procedure, which involves optimizing a highly nonconvex random function. We determine the sample complexity required for gradient flow to efficiently recover all spikes, without imposing any assumptions on the separation of the signal-to-noise ratios (SNRs). More precisely, our results provide the sample complexity required to guarantee recovery of the spikes up to a permutation. Our work builds on our companion paper [Ben Arous, Gerbelot, Piccolo 2024], which studies Langevin dynamics and determines the sample complexity and separation conditions for the SNRs necessary for ensuring exact recovery of the spikes (where the recovered permutation matches the identity). During the recovery process, the correlations between the estimators and the hidden vectors increase in a sequential manner. The order in which these correlations become significant depends on their initial values and the corresponding SNRs, which ultimately determines the permutation of the recovered spikes.
Submitted 20 December, 2024; v1 submitted 19 December, 2024;
originally announced December 2024.
-
The Larkin Mass and Replica Symmetry Breaking in the Elastic Manifold
Authors:
Gerard Ben Arous,
Pax Kivimae
Abstract:
This is the second of a series of three papers about the Elastic Manifold model. This classical model proposes a rich picture due to the competition between the inherent disorder and the smoothing effect of elasticity. In this paper, we analyze our variational formula for the free energy obtained in our first companion paper [16]. We show that this variational formula may be simplified to one which is solved by a unique saddle point. We show that this saddle point may be solved for in terms of the corresponding critical point equation. Moreover, its terms may be interpreted in terms of natural statistics of the model: namely the overlap distribution and effective radius of the model at a given site. Using this characterization, we obtain a complete characterization of the replica symmetry breaking phase. From this we are able to confirm a number of physical predictions about this boundary, namely those involving the Larkin mass [6, 53, 54], an important critical mass for the system. The zero-temperature Larkin mass has recently been shown to be the topological trivialization threshold, following work of Fyodorov and Le Doussal [37, 38], made rigorous by the first author, Bourgade and McKenna [12, 13].
Submitted 29 October, 2024;
originally announced October 2024.
-
The Free Energy of the Elastic Manifold
Authors:
Gerard Ben Arous,
Pax Kivimae
Abstract:
This is the first of a series of three papers about the Elastic Manifold model. This classical model proposes a rich picture due to the competition between the inherent disorder and the smoothing effect of elasticity. In this paper, we prove a Parisi formula, i.e. we compute the asymptotic quenched free energy and show it is given by the solution to a certain variational problem.
This work comes after a long and distinguished line of work in the Physics literature, going back to the 1980s (including the foundational work by Daniel Fisher [29], Marc Mezard and Giorgio Parisi [50, 51], and more recently by Yan Fyodorov and Pierre Le Doussal [34, 35]). Even though the mathematical study of Spin Glasses has seen deep progress in recent years, after the celebrated work by Michel Talagrand [67, 68], the Elastic Manifold model has been studied from a mathematical perspective only recently, and only at zero temperature. The annealed topological complexity has been computed by the first author with Paul Bourgade and Benjamin McKenna [15, 16]. Here we begin the study of this model at positive temperature by computing the quenched free energy.
We obtain our Parisi formula by first applying Laplace's method to reduce the question to a related new family of spherical Spin Glass models with an elastic interaction. The upper bound is then obtained through an interpolation argument initially developed by Francesco Guerra [42] for the study of Spin Glasses. The lower bound follows by adapting the cavity method along the lines explored by Wei-Kuo Chen [23] and the multi-species synchronization method of Dmitry Panchenko [55]. In our next papers [19, 20] we will analyze the consequences of this Parisi formula.
Submitted 24 October, 2024;
originally announced October 2024.
-
Stochastic gradient descent in high dimensions for multi-spiked tensor PCA
Authors:
Gérard Ben Arous,
Cédric Gerbelot,
Vanessa Piccolo
Abstract:
We study the dynamics in high dimensions of online stochastic gradient descent for the multi-spiked tensor model. This multi-index model arises from the tensor principal component analysis (PCA) problem with multiple spikes, where the goal is to estimate $r$ unknown signal vectors within the $N$-dimensional unit sphere through maximum likelihood estimation from noisy observations of a $p$-tensor. We determine the number of samples and the conditions on the signal-to-noise ratios (SNRs) required to efficiently recover the unknown spikes from natural random initializations. We show that full recovery of all spikes is possible given a number of samples scaling as $N^{p-2}$, matching the algorithmic threshold identified in the rank-one case [Ben Arous, Gheissari, Jagannath 2020, 2021]. Our results are obtained through a detailed analysis of a low-dimensional system that describes the evolution of the correlations between the estimators and the spikes, while controlling the noise in the dynamics. We find that the spikes are recovered sequentially in a process we term "sequential elimination": once a correlation exceeds a critical threshold, all correlations sharing a row or column index become sufficiently small, allowing the next correlation to grow and become macroscopic. The order in which correlations become macroscopic depends on their initial values and the corresponding SNRs, leading to either exact recovery or recovery of a permutation of the spikes. In the matrix case, when $p=2$, if the SNRs are sufficiently separated, we achieve exact recovery of the spikes, whereas equal SNRs lead to recovery of the subspace spanned by the spikes.
Submitted 23 October, 2024;
originally announced October 2024.
-
Langevin dynamics for high-dimensional optimization: the case of multi-spiked tensor PCA
Authors:
Gérard Ben Arous,
Cédric Gerbelot,
Vanessa Piccolo
Abstract:
We study nonconvex optimization in high dimensions through Langevin dynamics, focusing on the multi-spiked tensor PCA problem. This tensor estimation problem involves recovering $r$ hidden signal vectors (spikes) from noisy Gaussian tensor observations using maximum likelihood estimation. We study the number of samples required for Langevin dynamics to efficiently recover the spikes and determine the necessary separation condition on the signal-to-noise ratios (SNRs) for exact recovery, distinguishing the cases $p \ge 3$ and $p=2$, where $p$ denotes the order of the tensor. In particular, we show that the sample complexity required for recovering the spike associated with the largest SNR matches the well-known algorithmic threshold for the single-spike case, while this threshold degrades when recovering all $r$ spikes. As a key step, we provide a detailed characterization of the trajectory and interactions of low-dimensional projections that capture the high-dimensional dynamics.
Submitted 19 December, 2024; v1 submitted 12 August, 2024;
originally announced August 2024.
-
Spectral alignment of stochastic gradient descent for high-dimensional classification tasks
Authors:
Gerard Ben Arous,
Reza Gheissari,
Jiaoyang Huang,
Aukosh Jagannath
Abstract:
We rigorously study the relation between the training dynamics via stochastic gradient descent (SGD) and the spectra of empirical Hessian and gradient matrices. We prove that in two canonical classification tasks for multi-class high-dimensional mixtures and either 1 or 2-layer neural networks, both the SGD trajectory and emergent outlier eigenspaces of the Hessian and gradient matrices align with a common low-dimensional subspace. Moreover, in multi-layer settings this alignment occurs per layer, with the final layer's outlier eigenspace evolving over the course of training, and exhibiting rank deficiency when the SGD converges to sub-optimal classifiers. This establishes some of the rich predictions that have arisen from extensive numerical studies in the last decade about the spectra of Hessian and information matrices over the course of training in overparametrized networks.
Submitted 15 May, 2025; v1 submitted 4 October, 2023;
originally announced October 2023.
-
High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
Authors:
Gerard Ben Arous,
Reza Gheissari,
Aukosh Jagannath
Abstract:
We study the scaling limits of stochastic gradient descent (SGD) with constant step-size in the high-dimensional regime. We prove limit theorems for the trajectories of summary statistics (i.e., finite-dimensional functions) of SGD as the dimension goes to infinity. Our approach allows one to choose the summary statistics that are tracked, the initialization, and the step-size. It yields both ballistic (ODE) and diffusive (SDE) limits, with the limit depending dramatically on the former choices. We show a critical scaling regime for the step-size, below which the effective ballistic dynamics matches gradient flow for the population loss, but at which, a new correction term appears which changes the phase diagram. About the fixed points of this effective dynamics, the corresponding diffusive limits can be quite complex and even degenerate. We demonstrate our approach on popular examples including estimation for spiked matrix and tensor models and classification via two-layer networks for binary and XOR-type Gaussian mixture models. These examples exhibit surprising phenomena including multimodal timescales to convergence as well as convergence to sub-optimal solutions with probability bounded away from zero from random (e.g., Gaussian) initializations. At the same time, we demonstrate the benefit of overparametrization by showing that the latter probability goes to zero as the second layer width grows.
Submitted 17 August, 2023; v1 submitted 8 June, 2022;
originally announced June 2022.
-
Long Random Matrices and Tensor Unfolding
Authors:
Gérard Ben Arous,
Daniel Zhengyu Huang,
Jiaoyang Huang
Abstract:
In this paper, we consider the singular values and singular vectors of low rank perturbations of large rectangular random matrices, in the regime where the matrix is "long": we allow the number of rows (columns) to grow polynomially in the number of columns (rows). We prove there exists a critical signal-to-noise ratio (depending on the dimensions of the matrix), and the extreme singular values and singular vectors exhibit a BBP type phase transition. As a main application, we investigate the tensor unfolding algorithm for the asymmetric rank-one spiked tensor model, and obtain an exact threshold, which is independent of the procedure of tensor unfolding. If the signal-to-noise ratio is above the threshold, tensor unfolding detects the signals; otherwise, it fails to capture the signals.
Submitted 19 October, 2021;
originally announced October 2021.
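To make the tensor unfolding step mentioned in the abstract above more concrete, here is a minimal sketch under simplifying assumptions: a noiseless rank-one order-3 tensor is flattened along its first mode into a "long" matrix, and the top left singular vector is estimated by power iteration. The helper names, the choice of first-mode unfolding, and the toy vectors are all assumptions for illustration; the paper analyzes the noisy setting, which this sketch does not model.

```python
import math


def unfold(T):
    """Flatten an n1 x n2 x n3 tensor (nested lists) into an n1 x (n2*n3) matrix."""
    return [[entry for fiber in slab for entry in fiber] for slab in T]


def top_left_singular_vector(M, iters=100):
    """Power iteration on M M^T; returns a unit vector (its sign is arbitrary).

    Assumes the starting vector is not orthogonal to the top singular vector.
    """
    n = len(M)
    cols = len(M[0])
    u = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        # w = M^T u, then u_new = M w, so u_new = (M M^T) u.
        w = [sum(M[i][c] * u[i] for i in range(n)) for c in range(cols)]
        u_new = [sum(M[i][c] * w[c] for c in range(cols)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in u_new))
        u = [a / norm for a in u_new]
    return u


# Noiseless rank-one toy tensor: the outer product of three made-up unit vectors.
a = [0.6, 0.8]
b = [1.0, 0.0]
c = [0.0, 1.0]
T = [[[ai * bj * ck for ck in c] for bj in b] for ai in a]
M = unfold(T)  # 2 x 4 matrix whose row space is spanned by the flattened b, c factor
u_hat = top_left_singular_vector(M)
```

In this noiseless toy case `u_hat` recovers `a` up to sign. With additive noise, whether the top singular vector still correlates with the signal is the kind of question the threshold in the abstract addresses.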
-
Sharp complexity asymptotics and topological trivialization for the (p, k) spiked tensor model
Authors:
Antonio Auffinger,
Gerard Ben Arous,
Zhehua Li
Abstract:
We provide O(1) asymptotics for the average number of deep minima of the (p,k) spiked tensor model. We also derive an explicit formula for the limiting ground state energy on the N-dimensional sphere, similar to the work of Jagannath-Lopatto-Miolane. Moreover, when the signal-to-noise ratio is large enough, the expected number of deep minima is asymptotically finite as N tends to infinity and we determine its limit as the signal-to-noise ratio diverges.
Submitted 17 June, 2021;
originally announced June 2021.
-
Landscape complexity beyond invariance and the elastic manifold
Authors:
Gérard Ben Arous,
Paul Bourgade,
Benjamin McKenna
Abstract:
This paper characterizes the annealed, topological complexity (both of total critical points and of local minima) of the elastic manifold. This classical model of a disordered elastic system captures point configurations with self-interactions in a random medium. We establish the simple-vs.-glassy phase diagram in the model parameters, with these phases separated by a physical boundary known as the Larkin mass, confirming formulas of Fyodorov and Le Doussal.
One essential, dynamical, step of the proof also applies to a general signal-to-noise model of soft spins in an anisotropic well, for which we prove a negative-second-moment threshold distinguishing positive from zero complexity. A universal near-critical behavior appears within this phase portrait, namely quadratic near-critical vanishing of the complexity of total critical points, and cubic near-critical vanishing of the complexity of local minima.
These two models serve as a paradigm of complexity calculations for Gaussian landscapes exhibiting few distributional symmetries, i.e. beyond the invariant setting. The two main inputs for the proof are determinant asymptotics for non-invariant random matrices from our companion paper [Ben Arous, Bourgade, McKenna 2021], and the atypical convexity and integrability of the limiting variational problems.
Submitted 11 May, 2021;
originally announced May 2021.
-
Exponential growth of random determinants beyond invariance
Authors:
Gérard Ben Arous,
Paul Bourgade,
Benjamin McKenna
Abstract:
We give simple criteria to identify the exponential order of magnitude of the absolute value of the determinant for wide classes of random matrix models, not requiring the assumption of invariance. These include Gaussian matrices with covariance profiles, Wigner matrices and covariance matrices with subexponential tails, Erdős-Rényi and $d$-regular graphs for any polynomial sparsity parameter, and non-mean-field random matrix models, such as random band matrices for any polynomial bandwidth. The proof builds on recent tools, including the theory of the Matrix Dyson Equation as developed in [Ajanki, Erdős, Krüger 2019].
We use these asymptotics as an important input to identify the complexity of classes of Gaussian random landscapes in our companion papers [Ben Arous, Bourgade, McKenna 2021; McKenna 2021].
Submitted 15 April, 2022; v1 submitted 11 May, 2021;
originally announced May 2021.
-
Shattering Versus Metastability in Spin Glasses
Authors:
Gérard Ben Arous,
Aukosh Jagannath
Abstract:
Our goal in this work is to better understand the relationship between replica symmetry breaking, shattering, and metastability. To this end, we study the static and dynamic behaviour of spherical pure $p$-spin glasses above the replica symmetry breaking temperature $T_{s}$. In this regime, we find that there are at least two distinct temperatures related to non-trivial behaviour. First we prove that there is a regime of temperatures in which the spherical $p$-spin model exhibits a shattering phase. Our results hold in a regime above but near $T_s$. We then find that metastable states exist up to an even higher temperature $T_{BBM}$ as predicted by Barrat--Burioni--Mézard, which is expected to be higher than the phase boundary for the shattering phase, $T_d <T_{BBM}$. We approach this by first developing a Thouless--Anderson--Palmer decomposition which builds on the work of Subag. We then present a series of questions and conjectures regarding the sharp phase boundaries for shattering and slow mixing.
Submitted 16 April, 2021;
originally announced April 2021.
-
Counting equilibria of large complex systems by instability index
Authors:
Gérard Ben Arous,
Yan V Fyodorov,
Boris A Khoruzhenko
Abstract:
We consider a nonlinear autonomous system of $N\gg 1$ degrees of freedom randomly coupled by both relaxational ('gradient') and non-relaxational ('solenoidal') random interactions. We show that with increased interaction strength such systems generically undergo an abrupt transition from a trivial phase portrait with a single stable equilibrium into a topologically non-trivial regime of 'absolute instability' where equilibria are on average exponentially abundant, but typically all of them are unstable, unless the dynamics is purely gradient. When interactions increase even further the stable equilibria eventually become on average exponentially abundant unless the interaction is purely solenoidal. We further calculate the mean proportion of equilibria which have a fixed fraction of unstable directions.
Submitted 2 July, 2021; v1 submitted 3 August, 2020;
originally announced August 2020.
-
Free Energy Wells and Overlap Gap Property in Sparse PCA
Authors:
Gérard Ben Arous,
Alexander S. Wein,
Ilias Zadik
Abstract:
We study a variant of the sparse PCA (principal component analysis) problem in the "hard" regime, where the inference task is possible yet no polynomial-time algorithm is known to exist. Prior work, based on the low-degree likelihood ratio, has conjectured a precise expression for the best possible (sub-exponential) runtime throughout the hard regime. Following instead a statistical physics inspired point of view, we show bounds on the depth of free energy wells for various Gibbs measures naturally associated to the problem. These free energy wells imply hitting time lower bounds that corroborate the low-degree conjecture: we show that a class of natural MCMC (Markov chain Monte Carlo) methods (with worst-case initialization) cannot solve sparse PCA with less than the conjectured runtime. These lower bounds apply to a wide range of values for two tuning parameters: temperature and sparsity misparametrization. Finally, we prove that the Overlap Gap Property (OGP), a structural property that implies failure of certain local search algorithms, holds in a significant part of the hard regime.
Submitted 18 June, 2020;
originally announced June 2020.
-
Online stochastic gradient descent on non-convex losses from high-dimensional inference
Authors:
Gerard Ben Arous,
Reza Gheissari,
Aukosh Jagannath
Abstract:
Stochastic gradient descent (SGD) is a popular algorithm for optimization problems arising in high-dimensional inference tasks. Here one produces an estimator of an unknown parameter from independent samples of data by iteratively optimizing a loss function. This loss function is random and often non-convex. We study the performance of the simplest version of SGD, namely online SGD, from a random start in the setting where the parameter space is high-dimensional.
We develop nearly sharp thresholds for the number of samples needed for consistent estimation as one varies the dimension. Our thresholds depend only on an intrinsic property of the population loss which we call the information exponent. In particular, our results do not assume uniform control on the loss itself, such as convexity or uniform derivative bounds. The thresholds we obtain are polynomial in the dimension and the precise exponent depends explicitly on the information exponent. As a consequence of our results, we find that except for the simplest tasks, almost all of the data is used simply in the initial search phase to obtain non-trivial correlation with the ground truth. Upon attaining non-trivial correlation, the descent is rapid and exhibits law of large numbers type behavior.
We illustrate our approach by applying it to a wide set of inference tasks such as phase retrieval, and parameter estimation for generalized linear models, online PCA, and spiked tensor models, as well as to supervised learning for single-layer networks with general activation functions.
Submitted 10 May, 2021; v1 submitted 23 March, 2020;
originally announced March 2020.
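The online SGD procedure described in the abstract above, where each step uses one fresh sample, can be sketched in a toy form as follows. This is a minimal illustration under strong assumptions: the parameter is constrained to the unit sphere, the loss is the linear toy loss $L(x; y) = -\langle y, x\rangle$, and the samples are noiseless copies of a hidden unit vector `v`. The function names, loss, and step size are hypothetical and are not the paper's setup.

```python
import math


def normalize(x):
    """Rescale x to unit Euclidean norm."""
    norm = math.sqrt(sum(a * a for a in x))
    return [a / norm for a in x]


def online_sgd_step(x, grad, step):
    """One step: move against the sample gradient, then project back to the sphere."""
    return normalize([a - step * g for a, g in zip(x, grad)])


def correlation(x, v):
    """Inner product between the estimator and the hidden vector."""
    return sum(a * b for a, b in zip(x, v))


# Toy setting (an assumption for illustration): loss L(x; y) = -<y, x> with
# noiseless samples y = v, so the gradient at every sample is simply -v.
v = [1.0, 0.0, 0.0]
x = normalize([0.1, 1.0, 1.0])  # start with small correlation with v
history = [correlation(x, v)]
for _ in range(50):
    x = online_sgd_step(x, [-b for b in v], step=0.1)
    history.append(correlation(x, v))
```

In this toy run the correlation in `history` grows from a small initial value toward 1. The noisy, high-dimensional regime studied in the paper, where most samples are spent escaping the low-correlation region, is not captured by this noiseless example.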
-
Landscape Complexity for the Empirical Risk of Generalized Linear Models
Authors:
Antoine Maillard,
Gérard Ben Arous,
Giulio Biroli
Abstract:
We present a method to obtain the average and the typical value of the number of critical points of the empirical risk landscape for generalized linear estimation problems and variants. This represents a substantial extension of previous applications of the Kac-Rice method since it allows one to analyze the critical points of high-dimensional non-Gaussian random functions. Under a technical hypothesis, we obtain a rigorous explicit variational formula for the annealed complexity, which is the logarithm of the average number of critical points at a fixed value of the empirical risk. This result is simplified, and extended, using the non-rigorous Kac-Rice replicated method from theoretical physics. In this way we find an explicit variational formula for the quenched complexity, which is generally different from its annealed counterpart, and allows one to obtain the number of critical points for typical instances up to exponential accuracy.
Submitted 18 January, 2023; v1 submitted 4 December, 2019;
originally announced December 2019.
-
Very rare events for diffusion processes in short time
Authors:
Gérard Ben Arous,
Jing Wang
Abstract:
We study the large deviation estimates for the short time asymptotic behavior of a strongly degenerate diffusion process. Assuming a nilpotent structure of the Lie algebra generated by the driving vector fields, we obtain a graded large deviation principle and prove the existence of those "very rare events". In particular the first grade coincides with the classical Large Deviation Principle.
Submitted 28 January, 2019;
originally announced January 2019.
-
Bounding flows for spherical spin glass dynamics
Authors:
Gerard Ben Arous,
Reza Gheissari,
Aukosh Jagannath
Abstract:
We introduce a new approach to studying spherical spin glass dynamics based on differential inequalities for one-time observables. Using this approach, we obtain an approximate phase diagram for the evolution of the energy $H$ and its gradient under Langevin dynamics for spherical $p$-spin models. We then derive several consequences of this phase diagram. For example, at any temperature, uniformly over all starting points, the process must reach and remain in an absorbing region of large negative values of $H$ and large (in norm) gradients in order 1 time. Furthermore, if the process starts in a neighborhood of a critical point of $H$ with negative energy, then both the gradient and energy must increase macroscopically under this evolution, even if this critical point is a saddle with index of order $N$. As a key technical tool, we estimate Sobolev norms of spin glass Hamiltonians, which are of independent interest.
Submitted 24 October, 2019; v1 submitted 2 August, 2018;
originally announced August 2018.
-
Algorithmic thresholds for tensor PCA
Authors:
Gerard Ben Arous,
Reza Gheissari,
Aukosh Jagannath
Abstract:
We study the algorithmic thresholds for principal component analysis of Gaussian $k$-tensors with a planted rank-one spike, via Langevin dynamics and gradient descent. In order to efficiently recover the spike from natural initializations, the signal to noise ratio must diverge in the dimension. Our proof shows that the mechanism for the success/failure of recovery is the strength of the "curvature" of the spike on the maximum entropy region of the initial data. To demonstrate this, we study the dynamics on a generalized family of high-dimensional landscapes with planted signals, containing the spiked tensor models as specific instances. We identify thresholds of signal-to-noise ratios above which order 1 time recovery succeeds; in the case of the spiked tensor model these match the thresholds conjectured for algorithms such as Approximate Message Passing. Below these thresholds, where the curvature of the signal on the maximal entropy region is weak, we show that recovery from certain natural initializations takes at least stretched exponential time. Our approach combines global regularity estimates for spin glasses with point-wise estimates, to study the recovery problem by a perturbative approach.
Submitted 10 September, 2019; v1 submitted 2 August, 2018;
originally announced August 2018.
-
Geometry and temperature chaos in mixed spherical spin glasses at low temperature - the perturbative regime
Authors:
Gérard Ben Arous,
Eliran Subag,
Ofer Zeitouni
Abstract:
We study the Gibbs measure of mixed spherical $p$-spin glass models at low temperature, in (part of) the 1-RSB regime, including, in particular, models close to pure in an appropriate sense. We show that the Gibbs measure concentrates on spherical bands around deep critical points of the (extended) Hamiltonian restricted to the sphere of radius $\sqrt N q_\star$, where $q_\star^2$ is the rightmost point in the support of the overlap distribution. We also show that the relevant critical points are pairwise orthogonal for two different low temperatures. This allows us to explain why temperature chaos occurs for those models, in contrast to the pure spherical models.
Submitted 27 April, 2018;
originally announced April 2018.
-
Complex energy landscapes in spiked-tensor and simple glassy models: ruggedness, arrangements of local minima and phase transitions
Authors:
Valentina Ros,
Gerard Ben Arous,
Giulio Biroli,
Chiara Cammarota
Abstract:
We study rough high-dimensional landscapes in which an increasingly stronger preference for a given configuration emerges. Such energy landscapes arise in glass physics and inference. In particular we focus on random Gaussian functions, and on the spiked-tensor model and generalizations. We thoroughly analyze the statistical properties of the corresponding landscapes and characterize the associated geometrical phase transitions. In order to perform our study, we develop a framework based on the Kac-Rice method that allows one to compute the complexity of the landscape, i.e. the logarithm of the typical number of stationary points and their Hessian. This approach generalizes the one used to compute rigorously the annealed complexity of mean-field glass models. We discuss its advantages with respect to previous frameworks, in particular the thermodynamical replica method which is shown to lead to partially incorrect predictions.
Submitted 24 April, 2018; v1 submitted 8 April, 2018;
originally announced April 2018.
-
Comparing Dynamics: Deep Neural Networks versus Glassy Systems
Authors:
M. Baity-Jesi,
L. Sagun,
M. Geiger,
S. Spigler,
G. Ben Arous,
C. Cammarota,
Y. LeCun,
M. Wyart,
G. Biroli
Abstract:
We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems. The two main issues we address are (1) the complexity of the loss landscape and of the dynamics within it, and (2) to what extent DNNs share similarities with glassy systems. Our findings, obtained for different architectures and datasets, suggest that during the training process the dynamics slows down because of an increasingly large number of flat directions. At large times, when the loss is approaching zero, the system diffuses at the bottom of the landscape. Despite some similarities with the dynamics of mean-field glassy systems, in particular, the absence of barrier crossing, we find distinctive dynamical behaviors in the two cases, showing that the statistical properties of the corresponding loss and energy landscapes are different. In contrast, when the network is under-parametrized we observe a typical glassy behavior, thus suggesting the existence of different phases depending on whether the network is under-parametrized or over-parametrized.
Submitted 7 June, 2018; v1 submitted 19 March, 2018;
originally announced March 2018.
-
The landscape of the spiked tensor model
Authors:
Gerard Ben Arous,
Song Mei,
Andrea Montanari,
Mihai Nica
Abstract:
We consider the problem of estimating a large rank-one tensor ${\boldsymbol u}^{\otimes k}\in({\mathbb R}^{n})^{\otimes k}$, $k\ge 3$ in Gaussian noise. Earlier work characterized a critical signal-to-noise ratio $λ_{Bayes}= O(1)$ above which an ideal estimator achieves strictly positive correlation with the unknown vector of interest. Remarkably no polynomial-time algorithm is known that achieved this goal unless $λ\ge C n^{(k-2)/4}$ and even powerful semidefinite programming relaxations appear to fail for $1\ll λ\ll n^{(k-2)/4}$.
In order to elucidate this behavior, we consider the maximum likelihood estimator, which requires maximizing a degree-$k$ homogeneous polynomial over the unit sphere in $n$ dimensions. We compute the expected number of critical points and local maxima of this objective function and show that it is exponential in the dimension $n$, and give exact formulas for the exponential growth rate. We show that (for $λ$ larger than a constant) critical points are either very close to the unknown vector ${\boldsymbol u}$, or are confined in a band of width $Θ(λ^{-1/(k-1)})$ around the maximum circle that is orthogonal to ${\boldsymbol u}$. For local maxima, this band shrinks to be of size $Θ(λ^{-1/(k-2)})$. These `uninformative' local maxima are likely to cause the failure of optimization algorithms.
Submitted 25 January, 2018; v1 submitted 15 November, 2017;
originally announced November 2017.
-
Stable limit laws and structure of the scaling function for reaction-diffusion in random environment
Authors:
Gérard Ben Arous,
Stanislav Molchanov,
Alejandro F. Ramírez
Abstract:
We prove the emergence of stable fluctuations for reaction-diffusion in random environment with Weibull tails. This completes our work around the quenched to annealed transition phenomenon in this context of reaction diffusion. In [9], we had already considered the model treated here and had studied fully the regimes where the law of large numbers is satisfied and where the fluctuations are Gaussian, but we had left open the regime of stable fluctuations. Our work is based on a spectral approach centered on the classical theory of rank-one perturbations. It illustrates the gradual emergence of the role of the higher peaks of the environments. This approach also allows us to give the delicate exact asymptotics of the normalizing constants needed in the stable limit law.
Submitted 20 June, 2017;
originally announced June 2017.
-
Backbone scaling limits for random walks on random critical trees
Authors:
Gérard Ben Arous,
Manuel Cabezas,
Alexander Fribergh
Abstract:
We prove the existence of scaling limits for the projection on the backbone of the random walks on the Incipient Infinite Cluster and the Invasion Percolation Cluster on a regular tree. We treat these projected random walks as randomly trapped random walks (as defined in [BCČR15]) and thus describe these scaling limits as spatially subordinated Brownian motions.
Submitted 15 October, 2021; v1 submitted 16 May, 2017;
originally announced May 2017.
-
Spectral gap estimates in mean field spin glasses
Authors:
Gérard Ben Arous,
Aukosh Jagannath
Abstract:
We show that mixing for local, reversible dynamics of mean field spin glasses is exponentially slow in the low temperature regime. We introduce a notion of free energy barriers for the overlap, and prove that their existence implies that the spectral gap is exponentially small, and thus that mixing is exponentially slow. We then exhibit sufficient conditions on the equilibrium Gibbs measure which guarantee the existence of these barriers, using the notion of replicon eigenvalue and 2D Guerra Talagrand bounds. We show how these sufficient conditions cover large classes of Ising spin models for reversible nearest-neighbor dynamics and spherical models for Langevin dynamics. Finally, in the case of Ising spins, Panchenko's recent rigorous calculation [79] of the free energy for a system of "two real replica" enables us to prove a quenched LDP for the overlap distribution, which gives us a wider criterion for slow mixing directly related to the Franz-Parisi-Virasoro approach [43,60]. This condition holds in a wider range of temperatures.
Submitted 2 March, 2018; v1 submitted 11 May, 2017;
originally announced May 2017.
-
Scaling limit for the ant in a simple labyrinth
Authors:
Gérard Ben Arous,
Manuel Cabezas,
Alexander Fribergh
Abstract:
We prove that, after suitable rescaling, the simple random walk on the trace of a large critical branching random walk converges to the Brownian motion on the integrated super-Brownian excursion.
Submitted 13 September, 2016;
originally announced September 2016.
-
Scaling limit for the ant in high-dimensional labyrinths
Authors:
Gérard Ben Arous,
Manuel Cabezas,
Alexander Fribergh
Abstract:
We study here a detailed conjecture regarding one of the most important cases of anomalous diffusion, i.e. the behavior of the "ant in the labyrinth". It is natural to conjecture (see [16] and [8]) that the scaling limit for random walks on large critical random graphs exists in high dimensions, and is universal. This scaling limit is simply the natural Brownian Motion on the Integrated Super-Brownian Excursion. We give here a set of four natural sufficient conditions on the critical graphs and prove that this set of assumptions ensures the validity of this conjecture. The remaining future task is to prove that these sufficient conditions hold for the various classical cases of critical random structures, like the usual Bernoulli bond percolation, oriented percolation, spread-out percolation in high enough dimension. In the companion paper [10], we do precisely that in a first case, the random walk on the trace of a large critical branching random walk. We verify the validity of these sufficient conditions and thus obtain the scaling limit mentioned above, in dimensions larger than 14.
Submitted 13 September, 2016;
originally announced September 2016.
-
Explorations on high dimensional landscapes
Authors:
Levent Sagun,
V. Ugur Guney,
Gerard Ben Arous,
Yann LeCun
Abstract:
Finding minima of a real valued non-convex function over a high dimensional space is a major challenge in science. We provide evidence that some such functions that are defined on high dimensional domains have a narrow band of values whose pre-image contains the bulk of its critical points. This is in contrast with the low dimensional picture in which this band is wide. Our simulations agree with the previous theoretical work on spin glasses that proves the existence of such a band when the dimension of the domain tends to infinity. Furthermore our experiments on teacher-student networks with the MNIST dataset establish a similar phenomenon in deep networks. We finally observe that both the gradient descent and the stochastic gradient descent methods can reach this level within the same number of steps.
Submitted 6 April, 2015; v1 submitted 20 December, 2014;
originally announced December 2014.
-
The Loss Surfaces of Multilayer Networks
Authors:
Anna Choromanska,
Mikael Henaff,
Michael Mathieu,
Gérard Ben Arous,
Yann LeCun
Abstract:
We study the connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain the complexity of the fully decoupled neural network through the prism of the results from random matrix theory. We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum. The number of local minima outside that band diminishes exponentially with the size of the network. We empirically verify that the mathematical model exhibits similar behavior as the computer simulations, despite the presence of high dependencies in real networks. We conjecture that both simulated annealing and SGD converge to the band of low critical points, and that all critical points found there are local minima of high quality measured by the test error. This emphasizes a major difference between large- and small-size networks where for the latter poor quality local minima have non-zero probability of being recovered. Finally, we prove that recovering the global minimum becomes harder as the network size increases and that it is in practice irrelevant as global minimum often leads to overfitting.
Submitted 21 January, 2015; v1 submitted 30 November, 2014;
originally announced December 2014.
-
Biased random walks on random graphs
Authors:
Gerard Ben Arous,
Alexander Fribergh
Abstract:
These notes cover one of the topics programmed for the St Petersburg School in Probability and Statistical Physics of June 2012.
The aim is to review recent mathematical developments in the field of random walks in random environment. Our main focus will be on directionally transient and reversible random walks on different types of underlying graph structures, such as $\mathbb{Z}$, trees and $\mathbb{Z}^d$ for $d\geq 2$.
Submitted 19 June, 2014;
originally announced June 2014.
-
Smallest Singular Value for Perturbations of Random Permutation Matrices
Authors:
Gérard Ben Arous,
Kim Dang
Abstract:
We take a first small step to extend the validity of Rudelson-Vershynin type estimates to some sparse random matrices, here random permutation matrices. We give lower (and upper) bounds on the smallest singular value of a large random matrix D+M where M is a random permutation matrix, sampled uniformly, and D is diagonal. When D is itself random with i.i.d. terms on the diagonal, we obtain a Rudelson-Vershynin type estimate, using the classical theory of random walks with negative drift.
Submitted 15 April, 2014;
originally announced April 2014.
-
Randomly trapped random walks
Authors:
Gérard Ben Arous,
Manuel Cabezas,
Jiří Černý,
Roman Royfman
Abstract:
We introduce a general model of trapping for random walks on graphs. We give the possible scaling limits of these Randomly Trapped Random Walks on $\mathbb{Z}$. These scaling limits include the well-known fractional kinetics process, the Fontes-Isopi-Newman singular diffusion as well as a new broad class we call spatially subordinated Brownian motions. We give sufficient conditions for convergence and illustrate these on two important examples.
Submitted 29 October, 2015; v1 submitted 28 February, 2013;
originally announced February 2013.
-
A Central Limit Theorem in Many-Body Quantum Dynamics
Authors:
Gerard Ben Arous,
Kay Kirkpatrick,
Benjamin Schlein
Abstract:
We study the many body quantum evolution of bosonic systems in the mean field limit. The dynamics is known to be well approximated by the Hartree equation. So far, the available results have the form of a law of large numbers. In this paper we go one step further and we show that the fluctuations around the Hartree evolution satisfy a central limit theorem. Interestingly, the variance of the limiting Gaussian distribution is determined by a time-dependent Bogoliubov transformation describing the dynamics of initial coherent states in a Fock space representation of the system.
Submitted 26 March, 2012; v1 submitted 29 November, 2011;
originally announced November 2011.
-
A proof of the Lyons-Pemantle-Peres monotonicity conjecture for high biases
Authors:
Gerard Ben Arous,
Alexander Fribergh,
Vladas Sidoravicius
Abstract:
The speed $v(β)$ of a $β$-biased random walk on a Galton-Watson tree without leaves is increasing for $β\geq 717$.
Submitted 24 November, 2011;
originally announced November 2011.
-
Complexity of random smooth functions on the high-dimensional sphere
Authors:
Antonio Auffinger,
Gerard Ben Arous
Abstract:
We analyze the landscape of general smooth Gaussian functions on the sphere in dimension $N$, when $N$ is large. We give an explicit formula for the asymptotic complexity of the mean number of critical points of finite and diverging index at any level of energy and for the mean Euler characteristic of level sets. We then find two possible scenarios for the bottom landscape, one that has a layered structure of critical values and a strong correlation between indexes and critical values and another where even at levels below the limiting ground state energy the mean number of local minima is exponentially large. We end the paper by discussing how these results can be interpreted in the language of spin glass models.
Submitted 16 December, 2013; v1 submitted 26 October, 2011;
originally announced October 2011.
-
Einstein relation for biased random walk on Galton--Watson trees
Authors:
Gerard Ben Arous,
Yueyun Hu,
Stefano Olla,
Ofer Zeitouni
Abstract:
We prove the Einstein relation, relating the velocity under a small perturbation to the diffusivity in equilibrium, for certain biased random walks on Galton--Watson trees. This provides the first example where the Einstein relation is proved for motion in random media with arbitrarily deep traps.
Submitted 22 December, 2011; v1 submitted 22 June, 2011;
originally announced June 2011.
-
On fluctuations of eigenvalues of random permutation matrices
Authors:
Gérard Ben Arous,
Kim Dang
Abstract:
Smooth linear statistics of random permutation matrices, sampled under a general Ewens distribution, exhibit an interesting non-universality phenomenon. Though they have bounded variance, their fluctuations are asymptotically non-Gaussian but infinitely divisible. The fluctuations are asymptotically Gaussian for less smooth linear statistics for which the variance diverges. The degree of smoothness is measured in terms of the quality of the trapezoidal approximations of the integral of the observable.
Submitted 10 June, 2011;
originally announced June 2011.
-
Randomly biased walks on subcritical trees
Authors:
Gerard Ben Arous,
Alan Hammond
Abstract:
As a model of trapping by biased motion in random structure, we study the time taken for a biased random walk to return to the root of a subcritical Galton-Watson tree. We do so for trees in which these biases are randomly chosen, independently for distinct edges, according to a law that satisfies a logarithmic non-lattice condition. The mean return time of the walk is in essence given by the total conductance of the tree. We determine the asymptotic decay of this total conductance, finding it to have a pure power-law decay. In the case of the conductance associated to a single vertex at maximal depth in the tree, this asymptotic decay may be analysed by the classical defective renewal theorem, due to the non-lattice edge-bias assumption.
However, the derivation of the decay for total conductance requires computing an additional constant multiple outside the power-law that allows for the contribution of all vertices close to the base of the tree. This computation entails a detailed study of a convenient decomposition of the tree, under conditioning on the tree having high total conductance. As such, our principal conclusion may be viewed as a development of renewal theory in the context of random environments.
For randomly biased random walk on a supercritical Galton-Watson tree with positive extinction probability, our main results may be regarded as a description of the slowdown mechanism caused by the presence of subcritical trees adjacent to the backbone that may act as traps that detain the walker. Indeed, this conclusion is exploited in \cite{GerardAlan} to obtain a stable limiting law for walker displacement in such a tree.
Submitted 20 January, 2011;
originally announced January 2011.
-
Universality and extremal aging for dynamics of spin glasses on sub-exponential time scales
Authors:
G. Ben Arous,
O. Gun
Abstract:
We consider Random Hopping Time (RHT) dynamics of the Sherrington-Kirkpatrick (SK) model and p-spin models of spin glasses. For any of these models and for any inverse temperature we prove that, on time scales that are sub-exponential in the dimension, the properly scaled clock process (time-change process) of the dynamics converges to an extremal process. Moreover, on these time scales, the system exhibits aging-like behavior, which we call extremal aging. In other words, the dynamics of these models ages as the random energy model (REM) does. Hence, by extension, this confirms Bouchaud's REM-like trap model as a universal aging mechanism for a wide range of systems which, for the first time, includes the SK model.
Submitted 25 October, 2010;
originally announced October 2010.
-
Extreme gaps between eigenvalues of random matrices
Authors:
Gérard Ben Arous,
Paul Bourgade
Abstract:
This paper studies the extreme gaps between eigenvalues of random matrices. We give the joint limiting law of the smallest gaps for Haar-distributed unitary matrices and matrices from the Gaussian unitary ensemble. In particular, the $k$th smallest gap, normalized by a factor $n^{-4/3}$, has a limiting density proportional to $x^{3k-1}e^{-x^3}$. The largest gaps, normalized by $n/\sqrt{\log n}$, converge in ${\mathrm{L}}^p$ to a constant for all $p>0$. These results are compared with the extreme gaps between zeros of the Riemann zeta function.
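The proportionality constant in the limiting density above can be recovered by a one-line substitution. As a worked check, assuming the density lives on $(0,\infty)$ (a gap is nonnegative) and writing $c_k$ for the normalizing constant (a symbol introduced here only for illustration), the substitution $u=x^3$ gives:

```latex
\int_0^\infty x^{3k-1} e^{-x^3}\,dx
  = \frac{1}{3}\int_0^\infty u^{k-1} e^{-u}\,du
  = \frac{\Gamma(k)}{3}
  = \frac{(k-1)!}{3},
\qquad\text{hence}\qquad
c_k = \frac{3}{(k-1)!}.
```

For $k=1$ this reads $3x^2e^{-x^3}$, the density of a variable whose cube is a standard exponential.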
Submitted 24 July, 2013; v1 submitted 6 October, 2010;
originally announced October 2010.
-
Random Matrices and complexity of Spin Glasses
Authors:
A. Auffinger,
G. Ben Arous,
J. Cerny
Abstract:
We give an asymptotic evaluation of the complexity of spherical p-spin spin-glass models via random matrix theory. This study enables us to obtain detailed information about the bottom of the energy landscape, including the absolute minimum (the ground state) and the other local minima, and to describe an interesting layered structure of the low critical values for the Hamiltonians of these models. We also show that our approach allows us to compute the related TAP-complexity and extend the results known in the physics literature. As an independent tool, we prove an LDP for the k-th largest eigenvalue of the GOE, extending the results of Ben Arous, Dembo and Guionnet (2001).
Submitted 7 November, 2011; v1 submitted 4 March, 2010;
originally announced March 2010.
-
Current fluctuations for TASEP: A proof of the Prähofer--Spohn conjecture
Authors:
Gérard Ben Arous,
Ivan Corwin
Abstract:
We consider the family of two-sided Bernoulli initial conditions for TASEP which, as the left and right densities ($ρ_-,ρ_+$) are varied, give rise to shock waves and rarefaction fans---the two phenomena which are typical of TASEP. We provide a proof of Conjecture 7.1 of [Progr. Probab. 51 (2002) 185--204] which characterizes the order of and scaling functions for the fluctuations of the height function of two-sided TASEP in terms of the two densities $ρ_-,ρ_+$ and the speed $y$ around which the height is observed. In proving this theorem for TASEP, we also prove a fluctuation theorem for a class of corner growth processes with external sources, or equivalently for the last passage time in a directed last passage percolation model with two-sided boundary conditions: $ρ_-$ and $1-ρ_+$. We provide a complete characterization of the order of and the scaling functions for the fluctuations of this model's last passage time $L(N,M)$ as a function of three parameters: the two boundary/source rates $ρ_-$ and $1-ρ_+$, and the scaling ratio $γ^2=M/N$. The proof of this theorem draws on the results of [Comm. Math. Phys. 265 (2006) 1--44] and extensively on the work of [Ann. Probab. 33 (2005) 1643--1697] on finite rank perturbations of Wishart ensembles in random matrix theory.
Submitted 7 December, 2010; v1 submitted 19 May, 2009;
originally announced May 2009.
-
Free point processes and free extreme values
Authors:
G. Ben Arous,
V. Kargin
Abstract:
We continue here the study of free extreme values begun in Ben Arous and Voiculescu (2006). We study the convergence of the free point processes associated with free extreme values to a free Poisson random measure (Voiculescu (1998), Barndorff-Nielsen and Thorbjornsen (2005)). We relate this convergence to the free extremal laws introduced in Ben Arous and Voiculescu (2006) and give the limit laws for free order statistics.
Submitted 15 March, 2009;
originally announced March 2009.
-
Universality of REM-like aging in mean field spin glasses
Authors:
G. Ben Arous,
A. Bovier,
J. Cerny
Abstract:
Aging has become the paradigm to describe dynamical behavior of glassy systems, and in particular spin glasses. Trap models have been introduced as simple caricatures of effective dynamics of such systems. In this Letter we show that in a wide class of mean field models and on a wide range of time scales, aging occurs precisely as predicted by the REM-like trap model of Bouchaud and Dean. This is the first rigorous result about aging in mean field models except for the REM and the spherical model.
Submitted 12 December, 2007;
originally announced December 2007.
-
Biased random walks on a Galton-Watson tree with leaves
Authors:
Gérard Ben Arous,
Alexander Fribergh,
Nina Gantert,
Alan Hammond
Abstract:
We consider a biased random walk $X_n$ on a Galton-Watson tree with leaves in the sub-ballistic regime. We prove that there exists an explicit constant $γ= γ(β) \in (0,1)$, depending on the bias $β$, such that $X_n$ is of order $n^γ$. Denoting by $Δ_n$ the hitting time of level $n$, we prove that $Δ_n/n^{1/γ}$ is tight. Moreover we show that $Δ_n/n^{1/γ}$ does not converge in law (at least for large values of $β$). We prove that along the sequences $n_λ(k)=\lfloor λβ^{γk}\rfloor$, $Δ_n/n^{1/γ}$ converges to certain infinitely divisible laws. Key tools for the proof are the classical Harris decomposition for Galton-Watson trees, a new variant of regeneration times and the careful analysis of triangular arrays of i.i.d. heavy-tailed random variables.
Submitted 17 November, 2010; v1 submitted 23 November, 2007;
originally announced November 2007.
-
Poisson convergence for the largest eigenvalues of Heavy Tailed Random Matrices
Authors:
Antonio Auffinger,
Gerard Ben Arous,
Sandrine Peche
Abstract:
We study the statistics of the largest eigenvalues of real symmetric and sample covariance matrices when the entries are heavy tailed. Extending the result obtained by Soshnikov in \cite{Sos1}, we prove that, in the absence of the fourth moment, the top eigenvalues behave, in the limit, as the largest entries of the matrix.
Submitted 7 May, 2008; v1 submitted 16 October, 2007;
originally announced October 2007.
-
The spectrum of heavy-tailed random matrices
Authors:
Gerard Ben Arous,
Alice Guionnet
Abstract:
Let $X_N$ be an $N\times N$ random symmetric matrix with independent equidistributed entries. If the law $P$ of the entries has a finite second moment, it was shown by Wigner \cite{wigner} that the empirical distribution of the eigenvalues of $X_N$, once renormalized by $\sqrt{N}$, converges almost surely and in expectation to the so-called semicircular distribution as $N$ goes to infinity. In this paper we study the same question when $P$ is in the domain of attraction of an $α$-stable law. We prove that if we renormalize the eigenvalues by a constant $a_N$ of order $N^{1/α}$, the corresponding spectral distribution converges in expectation towards a law $μ_α$ which only depends on $α$. We characterize $μ_α$ and study some of its properties; it is a heavy-tailed probability measure which is absolutely continuous with respect to Lebesgue measure except possibly on a compact set of capacity zero.
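As a rough numerical illustration of the finite-second-moment case (Wigner's theorem) recalled above, the sketch below compares the second and fourth moments of the rescaled spectrum with those of the semicircular distribution, which are 1 and 2 for unit-variance entries. It is a sketch under stated assumptions, not the paper's method: standard Gaussian entries, the matrix size, the seed and the function name `spectral_moments` are all choices made here for illustration, and the moments are computed from traces so that no eigenvalue solver is needed.

```python
import random


def spectral_moments(n, seed=0):
    """Second and fourth moments of the spectrum of X/sqrt(n), X symmetric Gaussian.

    Uses (1/n) tr((X/sqrt(n))^p) = tr(X^p) / n^(1 + p/2), so no eigenvalues
    are computed. Gaussian entries are an illustrative assumption.
    """
    rng = random.Random(seed)
    # Symmetric matrix with independent N(0, 1) entries on and above the diagonal.
    x = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            x[i][j] = x[j][i] = rng.gauss(0.0, 1.0)

    # tr(X^2) is the sum of squared entries because X is symmetric.
    tr2 = sum(v * v for row in x for v in row)

    # Y = X^2 is symmetric, so tr(X^4) = tr(Y^2) is the sum of squared entries of Y.
    tr4 = 0.0
    for i in range(n):
        for j in range(n):
            y_ij = sum(x[i][k] * x[k][j] for k in range(n))
            tr4 += y_ij * y_ij

    return tr2 / n**2, tr4 / n**3


if __name__ == "__main__":
    m2, m4 = spectral_moments(120)
    print(f"second moment ~ {m2:.3f} (semicircle: 1)")
    print(f"fourth moment ~ {m4:.3f} (semicircle: 2)")
```

For heavy-tailed entries in the domain of attraction of an $α$-stable law these moments are not the right diagnostic, since the limiting law $μ_α$ is itself heavy-tailed; the check applies only to the finite-variance setting.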
Submitted 14 July, 2007;
originally announced July 2007.