-
Pseudo-Maximum Likelihood Theory for High-Dimensional Rank One Inference
Authors:
Curtis Grant,
Aukosh Jagannath,
Justin Ko
Abstract:
We develop a pseudo-likelihood theory for rank one matrix estimation problems in the high dimensional limit. We prove a variational principle for the limiting pseudo-maximum likelihood which also characterizes the performance of the corresponding pseudo-maximum likelihood estimator. We show that this variational principle is universal and depends only on four parameters determined by the corresponding null model. Through this universality, we introduce a notion of equivalence for estimation problems of this type and, in particular, show that a broad class of estimation tasks, including community detection, sparse submatrix detection, and non-linear spiked matrix models, are equivalent to spiked matrix models. As an application, we obtain a complete description of the performance of the least-squares (or ``best rank one'') estimator for any rank one matrix estimation problem.
Submitted 3 March, 2025;
originally announced March 2025.
-
Local geometry of high-dimensional mixture models: Effective spectral theory and dynamical transitions
Authors:
Gerard Ben Arous,
Reza Gheissari,
Jiaoyang Huang,
Aukosh Jagannath
Abstract:
We study the local geometry of empirical risks in high dimensions via the spectral theory of their Hessian and information matrices. We focus on settings where the data, $(Y_\ell)_{\ell =1}^n\in \mathbb R^d$, are i.i.d. draws of a $k$-component Gaussian mixture model, and the loss depends on the projection of the data into a fixed number of vectors, namely $\mathbf{x}^\top Y$, where $\mathbf{x}\in \mathbb{R}^{d\times C}$ are the parameters, and $C$ need not equal $k$. This setting captures a broad class of problems such as classification by one and two-layer networks and regression on multi-index models. We prove exact formulas for the limits of the empirical spectral distribution and outlier eigenvalues and eigenvectors of such matrices in the proportional asymptotics limit, where the number of samples and dimension $n,d\to\infty$ and $n/d=φ\in (0,\infty)$. These limits depend on the parameters $\mathbf{x}$ only through the summary statistic of the $(C+k)\times (C+k)$ Gram matrix of the parameters and class means, $\mathbf{G} = (\mathbf{x},\boldsymbol{μ})^\top(\mathbf{x},\boldsymbol{μ})$. It is known that under general conditions, when $\mathbf{x}$ is trained by stochastic gradient descent, the evolution of these same summary statistics along training converges to the solution of an autonomous system of ODEs, called the effective dynamics. This enables us to connect the spectral theory to the training dynamics. We demonstrate our general results by analyzing the effective spectrum along the effective dynamics in the case of multi-class logistic regression. In this setting, the empirical Hessian and information matrices have substantially different spectra, each with their own static and even dynamical spectral transitions.
Submitted 15 May, 2025; v1 submitted 21 February, 2025;
originally announced February 2025.
-
Finding planted cliques using gradient descent
Authors:
Reza Gheissari,
Aukosh Jagannath,
Yiming Xu
Abstract:
The planted clique problem is a paradigmatic model of statistical-to-computational gaps: the planted clique is information-theoretically detectable if its size $k\ge 2\log_2 n$ but polynomial-time algorithms only exist for the recovery task when $k= Ω(\sqrt{n})$. By now, there are many algorithms that succeed as soon as $k = Ω(\sqrt{n})$. Glaringly, however, no black-box optimization method, e.g., gradient descent or the Metropolis process, has been shown to work. In fact, Chen, Mossel, and Zadik recently showed that any Metropolis process whose state space is the set of cliques fails to find any sub-linear sized planted clique in polynomial time if initialized naturally from the empty set. We show that using the method of Lagrange multipliers, namely optimizing the Hamiltonian given by the sum of the objective function and the clique constraint over the space of all subgraphs, succeeds. In particular, we prove that Markov chains which minimize this Hamiltonian (gradient descent and a low-temperature relaxation of it) succeed at recovering planted cliques of size $k = Ω(\sqrt{n})$ if initialized from the full graph. Importantly, initialized from the empty set, the relaxation still does not help the gradient descent find sub-linear planted cliques. We also demonstrate robustness of these Markov chain approaches under a natural contamination model.
Submitted 15 May, 2025; v1 submitted 13 November, 2023;
originally announced November 2023.
-
Spectral alignment of stochastic gradient descent for high-dimensional classification tasks
Authors:
Gerard Ben Arous,
Reza Gheissari,
Jiaoyang Huang,
Aukosh Jagannath
Abstract:
We rigorously study the relation between the training dynamics via stochastic gradient descent (SGD) and the spectra of empirical Hessian and gradient matrices. We prove that in two canonical classification tasks for multi-class high-dimensional mixtures and either 1 or 2-layer neural networks, both the SGD trajectory and emergent outlier eigenspaces of the Hessian and gradient matrices align with a common low-dimensional subspace. Moreover, in multi-layer settings this alignment occurs per layer, with the final layer's outlier eigenspace evolving over the course of training, and exhibiting rank deficiency when the SGD converges to sub-optimal classifiers. This establishes some of the rich predictions that have arisen from extensive numerical studies in the last decade about the spectra of Hessian and information matrices over the course of training in overparametrized networks.
Submitted 15 May, 2025; v1 submitted 4 October, 2023;
originally announced October 2023.
-
Shattering in the Ising Pure $p$-Spin Model
Authors:
David Gamarnik,
Aukosh Jagannath,
Eren C. Kızıldağ
Abstract:
We study the Ising pure $p$-spin model for large $p$. We investigate the landscape of the Hamiltonian of this model. We show that for any $γ>0$ and any large enough $p$, the model exhibits an intricate geometrical property known as the multi Overlap Gap Property above the energy value $γ\sqrt{2\ln 2}$. We then show that for any inverse temperature $\sqrt{\ln 2}<β<\sqrt{2\ln 2}$ and any large $p$, the model exhibits shattering: w.h.p. as $n\to\infty$, there exist exponentially many well-separated clusters such that (a) each cluster has exponentially small Gibbs mass, and (b) the clusters collectively contain all but a vanishing fraction of Gibbs mass. Moreover, these clusters consist of configurations with energy near $β$. The range of temperatures for which shattering occurs is within the replica symmetric region. To the best of our knowledge, this is the first shattering result regarding the Ising $p$-spin models. Our proof is elementary, and in particular based on simple applications of the first and the second moment methods.
Submitted 14 July, 2023;
originally announced July 2023.
-
Existence of the free energy for heavy-tailed spin glasses
Authors:
Aukosh Jagannath,
Patrick Lopatto
Abstract:
We study the free energy of a mean-field spin glass whose coupling distribution has power law tails. Under the assumption that the couplings have infinite variance and finite mean, we show that the thermodynamic limit of the quenched free energy exists, and that the free energy is self-averaging.
Submitted 4 December, 2022; v1 submitted 17 November, 2022;
originally announced November 2022.
-
Differentially private multivariate medians
Authors:
Kelly Ramsay,
Aukosh Jagannath,
Shoja'eddin Chenouri
Abstract:
Statistical tools which satisfy rigorous privacy guarantees are necessary for modern data analysis. It is well-known that robustness against contamination is linked to differential privacy. Despite this fact, using multivariate medians for differentially private and robust multivariate location estimation has not been systematically studied. We develop novel finite-sample performance guarantees for differentially private multivariate depth-based medians, which are essentially sharp. Our results cover commonly used depth functions, such as the halfspace (or Tukey) depth, spatial depth, and the integrated dual depth. We show that under Cauchy marginals, the cost of heavy-tailed location estimation outweighs the cost of privacy. We demonstrate our results numerically using a Gaussian contamination model in dimensions up to d = 100, and compare them to a state-of-the-art private mean estimation algorithm. As a by-product of our investigation, we prove concentration inequalities for the output of the exponential mechanism about the maximizer of the population objective function. This bound applies to objective functions that satisfy a mild regularity condition.
Submitted 26 March, 2024; v1 submitted 12 October, 2022;
originally announced October 2022.
-
High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
Authors:
Gerard Ben Arous,
Reza Gheissari,
Aukosh Jagannath
Abstract:
We study the scaling limits of stochastic gradient descent (SGD) with constant step-size in the high-dimensional regime. We prove limit theorems for the trajectories of summary statistics (i.e., finite-dimensional functions) of SGD as the dimension goes to infinity. Our approach allows one to choose the summary statistics that are tracked, the initialization, and the step-size. It yields both ballistic (ODE) and diffusive (SDE) limits, with the limit depending dramatically on the former choices. We show a critical scaling regime for the step-size, below which the effective ballistic dynamics matches gradient flow for the population loss, but at which, a new correction term appears which changes the phase diagram. About the fixed points of this effective dynamics, the corresponding diffusive limits can be quite complex and even degenerate. We demonstrate our approach on popular examples including estimation for spiked matrix and tensor models and classification via two-layer networks for binary and XOR-type Gaussian mixture models. These examples exhibit surprising phenomena including multimodal timescales to convergence as well as convergence to sub-optimal solutions with probability bounded away from zero from random (e.g., Gaussian) initializations. At the same time, we demonstrate the benefit of overparametrization by showing that the latter probability goes to zero as the second layer width grows.
Submitted 17 August, 2023; v1 submitted 8 June, 2022;
originally announced June 2022.
-
Circuit Lower Bounds for the p-Spin Optimization Problem
Authors:
David Gamarnik,
Aukosh Jagannath,
Alexander S. Wein
Abstract:
We consider the problem of finding a near ground state of a $p$-spin model with Rademacher couplings by means of a low-depth circuit. As a direct extension of the authors' recent work [Gamarnik, Jagannath, Wein 2020], we establish that any poly-size $n$-output circuit that produces a spin assignment with objective value within a certain constant factor of optimality, must have depth at least $\log n/(2\log\log n)$ as $n$ grows. This is stronger than the known state of the art bounds of the form $Ω(\log n/(k(n)\log\log n))$ for similar combinatorial optimization problems, where $k(n)$ depends on the optimality value. For example, for the largest clique problem $k(n)$ corresponds to the square of the size of the clique [Rossman 2010]. At the same time our results are not quite comparable since in our case the circuits are required to produce a solution itself rather than solving the associated decision problem. As in our earlier work, the approach is based on the overlap gap property (OGP) exhibited by random $p$-spin models, but the derivation of the circuit lower bound relies further on standard facts from Fourier analysis on the Boolean cube, in particular the Linial-Mansour-Nisan Theorem.
To the best of our knowledge, this is the first instance when methods from spin glass theory have ramifications for circuit complexity.
Submitted 21 January, 2022; v1 submitted 3 September, 2021;
originally announced September 2021.
-
A simple construction of the dynamical $Φ^4_3$ model
Authors:
Aukosh Jagannath,
Nicolas Perkowski
Abstract:
The $Φ^4_3$ equation is a singular stochastic PDE with important applications in mathematical physics. Its solution usually requires advanced mathematical theories like regularity structures or paracontrolled distributions, and even local well-posedness is highly nontrivial. Here we propose a multiplicative transformation to reduce the periodic $Φ^4_3$ equation to a well-posed random PDE. This leads to a simple and elementary proof of global well-posedness, which only relies on Schauder estimates, the maximum principle, and basic estimates for paraproducts, and in particular does not need regularity structures or paracontrolled distributions.
Submitted 30 August, 2021;
originally announced August 2021.
-
Shattering Versus Metastability in Spin Glasses
Authors:
Gérard Ben Arous,
Aukosh Jagannath
Abstract:
Our goal in this work is to better understand the relationship between replica symmetry breaking, shattering, and metastability. To this end, we study the static and dynamic behaviour of spherical pure $p$-spin glasses above the replica symmetry breaking temperature $T_{s}$. In this regime, we find that there are at least two distinct temperatures related to non-trivial behaviour. First we prove that there is a regime of temperatures in which the spherical $p$-spin model exhibits a shattering phase. Our results hold in a regime above but near $T_s$. We then find that metastable states exist up to an even higher temperature $T_{BBM}$, as predicted by Barrat--Burioni--Mézard, which is expected to be higher than the phase boundary for the shattering phase, $T_d <T_{BBM}$. We approach this by first developing a Thouless--Anderson--Palmer decomposition which builds on the work of Subag. We then present a series of questions and conjectures regarding the sharp phase boundaries for shattering and slow mixing.
Submitted 16 April, 2021;
originally announced April 2021.
-
Hardness of Random Optimization Problems for Boolean Circuits, Low-Degree Polynomials, and Langevin Dynamics
Authors:
David Gamarnik,
Aukosh Jagannath,
Alexander S. Wein
Abstract:
We consider the problem of finding nearly optimal solutions of optimization problems with random objective functions. Two concrete problems we consider are (a) optimizing the Hamiltonian of a spherical or Ising $p$-spin glass model, and (b) finding a large independent set in a sparse Erdős-Rényi graph. The following families of algorithms are considered: (a) low-degree polynomials of the input; (b) low-depth Boolean circuits; (c) the Langevin dynamics algorithm. We show that these families of algorithms fail to produce nearly optimal solutions with high probability. For the case of Boolean circuits, our results improve the state-of-the-art bounds known in circuit complexity theory (although we consider the search problem as opposed to the decision problem).
Our proof uses the fact that these models are known to exhibit a variant of the overlap gap property (OGP) of near-optimal solutions. Specifically, for both models, every two solutions whose objectives are above a certain threshold are either close or far from each other. The crux of our proof is that the classes of algorithms we consider exhibit a form of stability. We show by an interpolation argument that stable algorithms cannot overcome the OGP barrier.
The stability of Langevin dynamics is an immediate consequence of the well-posedness of stochastic differential equations. The stability of low-degree polynomials and Boolean circuits is established using tools from Gaussian and Boolean analysis -- namely hypercontractivity and total influence, as well as a novel lower bound for random walks avoiding certain subsets. In the case of Boolean circuits, the result also makes use of Linial-Mansour-Nisan's classical theorem. Our techniques apply more broadly to low influence functions and may apply more generally.
Submitted 26 January, 2022; v1 submitted 25 April, 2020;
originally announced April 2020.
-
Online stochastic gradient descent on non-convex losses from high-dimensional inference
Authors:
Gerard Ben Arous,
Reza Gheissari,
Aukosh Jagannath
Abstract:
Stochastic gradient descent (SGD) is a popular algorithm for optimization problems arising in high-dimensional inference tasks. Here one produces an estimator of an unknown parameter from independent samples of data by iteratively optimizing a loss function. This loss function is random and often non-convex. We study the performance of the simplest version of SGD, namely online SGD, from a random start in the setting where the parameter space is high-dimensional.
We develop nearly sharp thresholds for the number of samples needed for consistent estimation as one varies the dimension. Our thresholds depend only on an intrinsic property of the population loss which we call the information exponent. In particular, our results do not assume uniform control on the loss itself, such as convexity or uniform derivative bounds. The thresholds we obtain are polynomial in the dimension and the precise exponent depends explicitly on the information exponent. As a consequence of our results, we find that except for the simplest tasks, almost all of the data is used simply in the initial search phase to obtain non-trivial correlation with the ground truth. Upon attaining non-trivial correlation, the descent is rapid and exhibits law of large numbers type behavior.
We illustrate our approach by applying it to a wide set of inference tasks such as phase retrieval, and parameter estimation for generalized linear models, online PCA, and spiked tensor models, as well as to supervised learning for single-layer networks with general activation functions.
Submitted 10 May, 2021; v1 submitted 23 March, 2020;
originally announced March 2020.
-
The Overlap Gap Property and Approximate Message Passing Algorithms for $p$-spin models
Authors:
David Gamarnik,
Aukosh Jagannath
Abstract:
We consider the algorithmic problem of finding a near ground state (near optimal solution) of a $p$-spin model. We show that for a class of algorithms broadly defined as Approximate Message Passing (AMP), the presence of the Overlap Gap Property (OGP), appropriately defined, is a barrier. We conjecture that when $p\ge 4$ the model does indeed exhibit OGP (and prove it for the space of binary solutions). Assuming the validity of this conjecture, as an implication, the AMP fails to find near ground states in these models, per our result. We extend our result to the problem of finding pure states by means of Thouless, Anderson and Palmer (TAP) based iterations, which is yet another example of AMP type algorithms. We show that such iterations fail to find pure states approximately, subject to the conjecture that the space of pure states exhibits the OGP, appropriately stated, when $p\ge 4$.
Submitted 25 November, 2019; v1 submitted 15 November, 2019;
originally announced November 2019.
-
The Overlap Gap Property in Principal Submatrix Recovery
Authors:
David Gamarnik,
Aukosh Jagannath,
Subhabrata Sen
Abstract:
We study support recovery for a $k \times k$ principal submatrix with elevated mean $λ/N$, hidden in an $N\times N$ symmetric mean zero Gaussian matrix. Here $λ>0$ is a universal constant, and we assume $k = N ρ$ for some constant $ρ\in (0,1)$. We establish that there exists a constant $C>0$ such that the MLE recovers a constant proportion of the hidden submatrix if $λ\geq C \sqrt{\frac{1}{ρ} \log \frac{1}{ρ}}$, while such recovery is information theoretically impossible if $λ= o( \sqrt{\frac{1}{ρ} \log \frac{1}{ρ}} )$. The MLE is computationally intractable in general, and in fact, for $ρ>0$ sufficiently small, this problem is conjectured to exhibit a statistical-computational gap. To provide rigorous evidence for this, we study the likelihood landscape for this problem, and establish that for some $\varepsilon>0$ and $\sqrt{\frac{1}{ρ} \log \frac{1}{ρ}} \ll λ\ll \frac{1}{ρ^{1/2 + \varepsilon}}$, the problem exhibits a variant of the Overlap-Gap-Property (OGP). As a direct consequence, we establish that a family of local MCMC based algorithms do not achieve optimal recovery. Finally, we establish that for $λ> 1/ρ$, a simple spectral method recovers a constant proportion of the hidden submatrix.
Submitted 12 December, 2020; v1 submitted 26 August, 2019;
originally announced August 2019.
-
Statistical thresholds for Tensor PCA
Authors:
Aukosh Jagannath,
Patrick Lopatto,
Leo Miolane
Abstract:
We study the statistical limits of testing and estimation for a rank one deformation of a Gaussian random tensor. We compute the sharp thresholds for hypothesis testing and estimation by maximum likelihood and show that they are the same. Furthermore, we find that the maximum likelihood estimator achieves the maximal correlation with the planted vector among measurable estimators above the estimation threshold. In this setting, the maximum likelihood estimator exhibits a discontinuous BBP-type transition: below the critical threshold the estimator is orthogonal to the planted vector, but above the critical threshold, it achieves positive correlation which is uniformly bounded away from zero.
Submitted 8 August, 2019; v1 submitted 8 December, 2018;
originally announced December 2018.
-
Bounding flows for spherical spin glass dynamics
Authors:
Gerard Ben Arous,
Reza Gheissari,
Aukosh Jagannath
Abstract:
We introduce a new approach to studying spherical spin glass dynamics based on differential inequalities for one-time observables. Using this approach, we obtain an approximate phase diagram for the evolution of the energy $H$ and its gradient under Langevin dynamics for spherical $p$-spin models. We then derive several consequences of this phase diagram. For example, at any temperature, uniformly over all starting points, the process must reach and remain in an absorbing region of large negative values of $H$ and large (in norm) gradients in order 1 time. Furthermore, if the process starts in a neighborhood of a critical point of $H$ with negative energy, then both the gradient and energy must increase macroscopically under this evolution, even if this critical point is a saddle with index of order $N$. As a key technical tool, we estimate Sobolev norms of spin glass Hamiltonians, which are of independent interest.
Submitted 24 October, 2019; v1 submitted 2 August, 2018;
originally announced August 2018.
-
Algorithmic thresholds for tensor PCA
Authors:
Gerard Ben Arous,
Reza Gheissari,
Aukosh Jagannath
Abstract:
We study the algorithmic thresholds for principal component analysis of Gaussian $k$-tensors with a planted rank-one spike, via Langevin dynamics and gradient descent. In order to efficiently recover the spike from natural initializations, the signal to noise ratio must diverge in the dimension. Our proof shows that the mechanism for the success/failure of recovery is the strength of the "curvature" of the spike on the maximum entropy region of the initial data. To demonstrate this, we study the dynamics on a generalized family of high-dimensional landscapes with planted signals, containing the spiked tensor models as specific instances. We identify thresholds of signal-to-noise ratios above which order 1 time recovery succeeds; in the case of the spiked tensor model these match the thresholds conjectured for algorithms such as Approximate Message Passing. Below these thresholds, where the curvature of the signal on the maximal entropy region is weak, we show that recovery from certain natural initializations takes at least stretched exponential time. Our approach combines global regularity estimates for spin glasses with point-wise estimates, to study the recovery problem by a perturbative approach.
Submitted 10 September, 2019; v1 submitted 2 August, 2018;
originally announced August 2018.
-
On spin distributions for generic p-spin models
Authors:
Antonio Auffinger,
Aukosh Jagannath
Abstract:
We provide an alternative formula for spin distributions of generic p-spin glass models. As a main application of this expression, we write spin statistics as solutions of partial differential equations and we show that the generic p-spin models satisfy multiscale Thouless-Anderson-Palmer equations as originally predicted in the work of Mezard-Virasoro.
Submitted 22 February, 2018; v1 submitted 20 February, 2018;
originally announced February 2018.
-
On the unbalanced cut problem and the generalized Sherrington-Kirkpatrick model
Authors:
Aukosh Jagannath,
Subhabrata Sen
Abstract:
We establish a strict asymptotic inequality between a class of graph partition problems on the sparse Erdős-Rényi and random regular graph ensembles with the same average degree. Along the way, we establish a variational representation for the ground state energy for generalized mixed $p$-spin glasses and derive strict comparison inequalities for such models as the alphabet changes.
Submitted 2 September, 2020; v1 submitted 27 July, 2017;
originally announced July 2017.
-
Spectral gap estimates in mean field spin glasses
Authors:
Gérard Ben Arous,
Aukosh Jagannath
Abstract:
We show that mixing for local, reversible dynamics of mean field spin glasses is exponentially slow in the low temperature regime. We introduce a notion of free energy barriers for the overlap, and prove that their existence implies that the spectral gap is exponentially small, and thus that mixing is exponentially slow. We then exhibit sufficient conditions on the equilibrium Gibbs measure which guarantee the existence of these barriers, using the notion of replicon eigenvalue and 2D Guerra Talagrand bounds. We show how these sufficient conditions cover large classes of Ising spin models for reversible nearest-neighbor dynamics and spherical models for Langevin dynamics. Finally, in the case of Ising spins, Panchenko's recent rigorous calculation [79] of the free energy for a system of "two real replica" enables us to prove a quenched LDP for the overlap distribution, which gives us a wider criterion for slow mixing directly related to the Franz-Parisi-Virasoro approach [43,60]. This condition holds in a wider range of temperatures.
Submitted 2 March, 2018; v1 submitted 11 May, 2017;
originally announced May 2017.
-
A connection between MAX $κ$-CUT and the inhomogeneous Potts spin glass in the large degree limit
Authors:
Aukosh Jagannath,
Justin Ko,
Subhabrata Sen
Abstract:
We study the asymptotic behavior of the Max $κ$-cut on a family of sparse, inhomogeneous random graphs. In the large degree limit, the leading term is a variational problem, involving the ground state of a constrained inhomogeneous Potts spin glass. We derive a Parisi type formula for the free energy of this model, with possible constraints on the proportions, and derive the limiting ground state energy by a suitable zero temperature limit.
Submitted 9 March, 2017;
originally announced March 2017.
-
Random matrices and the New York City subway system
Authors:
Aukosh Jagannath,
Thomas Trogdon
Abstract:
We analyze subway arrival times in the New York City subway system. We find regimes where the gaps between trains exhibit both (unitarily invariant) random matrix statistics and Poisson statistics. The departure from random matrix statistics is captured by the value of the Coulomb potential along the subway route. This departure becomes more pronounced as trains make more stops.
Submitted 7 March, 2017;
originally announced March 2017.
-
Thouless-Anderson-Palmer equations for the generic p-spin glass model
Authors:
Antonio Auffinger,
Aukosh Jagannath
Abstract:
We study the Thouless-Anderson-Palmer (TAP) equations for spin glasses on the hypercube. First, using a random, approximately ultrametric decomposition of the hypercube, we decompose the Gibbs measure, $\langle\cdot\rangle_N$, into a mixture of conditional laws, $\langle\cdot\rangle_{α,N}$. We show that the TAP equations hold for the spin at any site with respect to $\langle\cdot\rangle_{α,N}$ simultaneously for all $α$. This result holds for generic models provided that the Parisi measure of the model has a jump at the top of its support.
Submitted 21 February, 2018; v1 submitted 19 December, 2016;
originally announced December 2016.
-
On the Spectral Gap of Spherical Spin Glass Dynamics
Authors:
Reza Gheissari,
Aukosh Jagannath
Abstract:
We consider the time to equilibrium for the Langevin dynamics of the spherical $p$-spin glass model of system size $N$. We show that the log-Sobolev constant and spectral gap are order $1$ in $N$ at sufficiently high temperature whereas the spectral gap decays exponentially in $N$ at sufficiently low temperatures. These verify the existence of a dynamical high temperature phase and a dynamical glass phase at the level of the spectral gap. Key to these results is the understanding of the extremal process and restricted free energy of Subag--Zeitouni and Subag.
Submitted 4 June, 2018; v1 submitted 23 August, 2016;
originally announced August 2016.
-
Bounding the Complexity of Replica Symmetry Breaking for Spherical Spin Glasses
Authors:
Aukosh Jagannath,
Ian Tobasco
Abstract:
In this paper, we study the Crisanti-Sommers variational problem, which is a variational formula for the free energy of spherical mixed $p$-spin glasses. We begin by computing the dual of this problem using a min-max argument. We find that the dual is a 1-D problem of obstacle type, where the obstacle is related to the covariance structure of the underlying process. This approach yields an alternative way to understand Replica Symmetry Breaking at the level of the variational problem through topological properties of the coincidence set of the optimal dual variable. Using this duality, we give an algorithm to reduce this a priori infinite dimensional variational problem to a finite dimensional one, thereby confining all possible forms of Replica Symmetry Breaking in these models to a finite parameter family. These results complement the authors' related results for the low temperature $Γ$-limit of this variational problem. We briefly discuss the analysis of the Replica Symmetric phase using this approach.
Submitted 7 July, 2016;
originally announced July 2016.
-
Low Temperature Asymptotics in Spherical Mean Field Spin Glasses
Authors:
Aukosh Jagannath,
Ian Tobasco
Abstract:
In this paper, we study the low temperature limit of the spherical Crisanti-Sommers variational problem. We identify the $Γ$-limit of the Crisanti-Sommers functionals, thereby establishing a rigorous variational problem for the ground state energy of spherical mixed $p$-spin glasses. As an application, we compute moderate deviations of the corresponding minimizers in the low temperature limit. In particular, for a large class of models this yields moderate deviations for the overlap distribution. We then analyze the ground state energy problem. We show that this variational problem is dual to an obstacle-type problem. This duality is at the heart of our analysis. We present the regularity theory of the optimizers of the primal and dual problems. This culminates in a simple method for constructing a finite dimensional space in which these optimizers live for any model. As a consequence of these results, we unify independent predictions of Crisanti-Leuzzi and Auffinger-Ben Arous regarding the 1RSB phase in this limit. We find that the "positive replicon eigenvalue" and "pure-like" conditions are together necessary for optimality, but that neither are themselves sufficient, answering a question of Auffinger and Ben Arous in the negative. We end by proving that these conditions completely characterize the 1RSB phase in $2+p$-spin models.
Submitted 1 February, 2016;
originally announced February 2016.
-
On the overlap distribution of branching random walks
Authors:
Aukosh Jagannath
Abstract:
In this paper, we study the overlap distribution and Gibbs measure of the Branching Random Walk with Gaussian increments on a binary tree. We first prove that the Branching Random Walk is 1 step Replica Symmetry Breaking and give a precise form for its overlap distribution, verifying a prediction of Derrida and Spohn. We then prove that the Gibbs measure of this system satisfies the Ghirlanda-Guerra identities. As a consequence, the limiting Gibbs measure has Poisson-Dirichlet statistics. The main technical result is a proof that the overlap distribution for the Branching Random Walk is supported on the set $\{0,1\}$.
Submitted 10 June, 2017; v1 submitted 24 September, 2015;
originally announced September 2015.
-
Some Properties of the Phase Diagram for Mixed $p$-Spin Glasses
Authors:
Aukosh Jagannath,
Ian Tobasco
Abstract:
In this paper we study the Parisi variational problem for mixed $p$-spin glasses with Ising spins. Our starting point is a characterization of Parisi measures whose origin lies in the first order optimality conditions for the Parisi functional, which is known to be strictly convex. Using this characterization, we study the phase diagram in the temperature-external field plane. We begin by deriving self-consistency conditions for Parisi measures that generalize those of de Almeida and Thouless to all levels of Replica Symmetry Breaking (RSB) and all models. As a consequence, we conjecture that for all models the Replica Symmetric (RS) phase is the region determined by the natural analogue of the de Almeida-Thouless condition. We show that for all models, the complement of this region is in the RSB phase. Furthermore, we show that the conjectured phase boundary is exactly the phase boundary in the plane less a bounded set. In the case of the Sherrington-Kirkpatrick model, we extend this last result to show that this bounded set does not contain the critical point at zero external field.
Submitted 22 December, 2015; v1 submitted 10 April, 2015;
originally announced April 2015.
-
A Dynamic Programming Approach to the Parisi Functional
Authors:
Aukosh Jagannath,
Ian Tobasco
Abstract:
G. Parisi predicted an important variational formula for the thermodynamic limit of the intensive free energy for a class of mean field spin glasses. In this paper, we present an elementary approach to the study of the Parisi functional using stochastic dynamic programming and semi-linear PDE. We give a derivation of important properties of the Parisi PDE avoiding the use of Ruelle Probability Cascades and Cole-Hopf transformations. As an application, we give a simple proof of the strict convexity of the Parisi functional, which was recently proved by Auffinger and Chen in [2].
Submitted 30 November, 2015; v1 submitted 15 February, 2015;
originally announced February 2015.
-
Approximate Ultrametricity for Random Measures and Applications to Spin Glasses
Authors:
Aukosh Jagannath
Abstract:
In this paper, we introduce a notion called "Approximate Ultrametricity" which encapsulates the phenomenology of a sequence of random probability measures having supports that behave like ultrametric spaces insofar as they decompose into nested balls. We provide a sufficient condition for a sequence of random probability measures on the unit ball of an infinite dimensional separable Hilbert space to admit such a decomposition, whose elements we call clusters. We also characterize the laws of the measures of the clusters by showing that they converge in law to the weights of a Ruelle Probability Cascade. These results apply to a large class of classical models in mean field spin glasses. We illustrate the notion of approximate ultrametricity by proving two important conjectures regarding mixed p-spin glasses.
Submitted 22 December, 2014;
originally announced December 2014.
-
Solution of the propeller conjecture in $\mathbb{R}^3$
Authors:
Steven Heilman,
Aukosh Jagannath,
Assaf Naor
Abstract:
It is shown that every measurable partition $\{A_1,..., A_k\}$ of $\mathbb{R}^3$ satisfies $$\sum_{i=1}^k||\int_{A_i} xe^{-\frac12||x||_2^2}dx||_2^2\le 9π^2.\qquad(*)$$ Let $\{P_1,P_2,P_3\}$ be the partition of $\mathbb{R}^2$ into $120^\circ$ sectors centered at the origin. The bound is sharp, with equality holding if $A_i=P_i\times \mathbb{R}$ for $i\in \{1,2,3\}$ and $A_i=\emptyset$ for $i\in \{4,...,k\}$ (up to measure zero corrections, orthogonal transformations and renumbering of the sets $\{A_1,...,A_k\}$). This settles positively the 3-dimensional Propeller Conjecture of Khot and Naor (FOCS 2008). The proof reduces the problem to a finite set of numerical inequalities which are then verified with full rigor in a computer-assisted fashion. The main consequence (and motivation) of $(*)$ is complexity-theoretic: the Unique Games hardness threshold of the Kernel Clustering problem with $4 \times 4$ centered and spherical hypothesis matrix equals $\frac{2π}{3}$.
Submitted 5 April, 2014; v1 submitted 13 December, 2011;
originally announced December 2011.