-
Joint Learning in the Gaussian Single Index Model
Authors:
Loucas Pillaud-Vivien,
Adrien Schertzer
Abstract:
We consider the problem of jointly learning a one-dimensional projection and a univariate function in high-dimensional Gaussian models. Specifically, we study predictors of the form $f(x)=\varphi^\star(\langle w^\star, x \rangle)$, where both the direction $w^\star \in \mathcal{S}_{d-1}$, the sphere of $\mathbb{R}^d$, and the function $\varphi^\star: \mathbb{R} \to \mathbb{R}$ are learned from Gaussian data. This setting captures a fundamental non-convex problem at the intersection of representation learning and nonlinear regression. We analyze the gradient flow dynamics of a natural alternating scheme and prove convergence, with a rate controlled by the information exponent reflecting the \textit{Gaussian regularity} of the function $\varphi^\star$. Strikingly, our analysis shows that convergence still occurs even when the initial direction is negatively correlated with the target. On the practical side, we demonstrate that such joint learning can be effectively implemented using a Reproducing Kernel Hilbert Space (RKHS) adapted to the structure of the problem, enabling efficient and flexible estimation of the univariate function. Our results offer both theoretical insight and practical methodology for learning low-dimensional structure in high-dimensional settings.
Submitted 27 May, 2025;
originally announced May 2025.
-
On the Replica Symmetry of a Variant of the Sherrington-Kirkpatrick Spin Glass
Authors:
Christian Brennecke,
Adrien Schertzer
Abstract:
We consider $N$ i.i.d. Ising spins with mean $m\in (-1,1)$ whose interactions are described by a Sherrington-Kirkpatrick Hamiltonian with a quartic correction. This model was recently introduced by Bolthausen in \cite{Bolt2} as a toy model to understand whether a second moment argument can be used to derive the replica symmetric formula in the full high temperature regime if $m\neq 0$. In \cite{Bolt2}, Bolthausen suggested that a natural analogue of the de Almeida-Thouless condition for the toy model is
\begin{equation}\label{eq:conj} β^2(1-m^2)^2\leq 1. \tag{1}\end{equation}
Here, $β\geq 0$ corresponds to the inverse temperature. While the second moment method implies replica symmetry for $β$ sufficiently small, Bolthausen showed that the method fails to prove replica symmetry in the full region described by (1). A natural question that was left open in \cite{Bolt2} is whether (1) correctly characterizes the high temperature phase of the toy model. In this note, we show that this is indeed not the case. We prove that if $|m| \geq m_*$, for some $m_* \in (0,1)$, the limiting free energy of the toy model is negative for suitable $β$ that satisfy (1).
Submitted 9 December, 2024; v1 submitted 5 December, 2024;
originally announced December 2024.
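As a small illustration of condition (1) in the abstract above, the sketch below checks whether a pair $(β, m)$ lies in the region $β^2(1-m^2)^2 \leq 1$, which for $m \in (-1,1)$ and $β \geq 0$ is equivalent to $β \leq 1/(1-m^2)$. The function names are hypothetical and not taken from the paper; the code only evaluates the inequality and says nothing about the free energy of the model.

```python
def satisfies_condition_1(beta: float, m: float) -> bool:
    """Return True if beta^2 * (1 - m^2)^2 <= 1 (condition (1) of the abstract).

    Hypothetical helper: it only evaluates the inequality for beta >= 0
    and m in (-1, 1), the parameter ranges stated in the abstract.
    """
    if beta < 0 or not -1 < m < 1:
        raise ValueError("expected beta >= 0 and m in (-1, 1)")
    return beta**2 * (1 - m**2) ** 2 <= 1


def max_beta_allowed(m: float) -> float:
    """Largest beta satisfying condition (1) for a given mean m in (-1, 1)."""
    if not -1 < m < 1:
        raise ValueError("expected m in (-1, 1)")
    return 1.0 / (1.0 - m * m)
```

For example, `satisfies_condition_1(1.5, 0.0)` is false while `satisfies_condition_1(1.5, 0.9)` is true, since the admissible range of $β$ widens as $|m|$ approaches 1.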
-
Stochastic Differential Equations models for Least-Squares Stochastic Gradient Descent
Authors:
Adrien Schertzer,
Loucas Pillaud-Vivien
Abstract:
We study the dynamics of a continuous-time model of Stochastic Gradient Descent (SGD) for the least-squares problem. Building on the work of Li et al. (2019), we analyze Stochastic Differential Equations (SDEs) that model SGD either in the case of the training loss (finite samples) or the population one (online setting). A key qualitative feature of the dynamics is the existence of a perfect interpolator of the data, irrespective of the sample size. In both scenarios, we provide precise, non-asymptotic rates of convergence to the (possibly degenerate) stationary distribution. Additionally, we describe this asymptotic distribution, offering estimates of its mean, deviations from it, and a proof of the emergence of heavy tails related to the step-size magnitude. Numerical simulations supporting our findings are also presented.
Submitted 2 July, 2024;
originally announced July 2024.
-
Spin Covariance Fluctuations in the SK Model at High Temperature
Authors:
Christian Brennecke,
Adrien Schertzer,
Chen Van Dam
Abstract:
Based on \cite{H}, it is well known that the rescaled two point correlation functions
\[ \sqrt{N} \langle σ_i ; σ_j\rangle = \sqrt{N} \big( \langle σ_i σ_j\rangle -\langle σ_i\rangle \langle σ_j\rangle\big) \] in the Sherrington-Kirkpatrick spin glass model with non-zero external field admit at sufficiently high temperature an explicit non-Gaussian distributional limit as $N\to \infty$. Inspired by recent results from \cite{ABSY, BSXY, BXY}, we provide a novel proof of the distributional convergence which is based on expanding $\langle σ_i ; σ_j\rangle$ into a sum over suitable weights of self-avoiding paths from vertex $i$ to $j$. Compared to \cite{H}, our key observation is that the path representation of $\langle σ_i ; σ_j\rangle$ provides a direct explanation of the specific form of the limiting distribution of $\sqrt{N} \langle σ_i ; σ_j\rangle$.
Submitted 29 April, 2024;
originally announced April 2024.
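To make the truncated correlation $\langle σ_i ; σ_j\rangle = \langle σ_i σ_j\rangle -\langle σ_i\rangle \langle σ_j\rangle$ from the abstract above concrete, here is a minimal brute-force sketch that evaluates it exactly for a very small system by summing over all $2^N$ spin configurations. The function name, the Gibbs weight convention $\exp\big(β\sum_{a<b} g_{ab} σ_a σ_b + h\sum_a σ_a\big)$, and the $1/\sqrt{N}$ scaling of the couplings in the demo are assumptions made for illustration, not details taken from the paper.

```python
import itertools
import math
import random


def truncated_correlation(couplings, beta, h, i, j):
    """Exact <s_i; s_j> = <s_i s_j> - <s_i><s_j> by full enumeration.

    Assumed convention: Gibbs weight exp(beta * sum_{a<b} g_ab s_a s_b + h * sum_a s_a).
    Only the upper triangle of `couplings` is read. Cost is O(2^N * N^2),
    so this is usable for small N only.
    """
    n = len(couplings)
    z = s_i = s_j = s_ij = 0.0
    for spins in itertools.product((-1, 1), repeat=n):
        exponent = h * sum(spins)
        for a in range(n):
            for b in range(a + 1, n):
                exponent += beta * couplings[a][b] * spins[a] * spins[b]
        weight = math.exp(exponent)
        z += weight
        s_i += weight * spins[i]
        s_j += weight * spins[j]
        s_ij += weight * spins[i] * spins[j]
    return s_ij / z - (s_i / z) * (s_j / z)


if __name__ == "__main__":
    random.seed(0)
    n = 8
    # Assumed scaling: i.i.d. standard Gaussians divided by sqrt(N).
    g = [[random.gauss(0.0, 1.0) / math.sqrt(n) for _ in range(n)] for _ in range(n)]
    print(math.sqrt(n) * truncated_correlation(g, beta=0.3, h=0.5, i=0, j=1))
```

With all couplings set to zero the spins are independent, so the function returns zero for $i \neq j$ and $1 - \tanh^2(h)$ for $i = j$, which gives a quick sanity check of the convention.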
-
AMP algorithms and Stein's method: Understanding TAP equations with a new method
Authors:
Stephan Gufler,
Adrien Schertzer,
Marius A. Schmidt
Abstract:
We propose a new iterative construction of solutions of the classical TAP equations for the Sherrington-Kirkpatrick model, i.e. with finite-size Onsager correction. The algorithm can be started at an arbitrary point, and converges up to the AT line. The analysis relies on a novel treatment of mean field algorithms through Stein's method. As such, the approach also yields weak convergence of the effective fields at all temperatures towards Gaussians, and can be applied, upon proper alterations, to all models where TAP-like equations and a Stein operator are available.
Submitted 20 November, 2023;
originally announced November 2023.
-
The Two Point Function of the SK Model without External Field at High Temperature
Authors:
Christian Brennecke,
Adrien Schertzer,
Changji Xu,
Horng-Tzer Yau
Abstract:
We show that the two point correlation matrix $ \textbf{M}= (\langle σ_i σ_j\rangle)_{1\leq i,j\leq N} $ of the Sherrington-Kirkpatrick model with zero external field satisfies
\[ \lim_{N\to\infty} \| \textbf{M} - ( 1+β^2 - β\textbf{G})^{-1} \|_{\text{op}} =0 \] in probability, in the full high temperature regime $β< 1$. Here, $\textbf{G}$ denotes the GOE interaction matrix of the model.
Submitted 8 November, 2023; v1 submitted 29 December, 2022;
originally announced December 2022.
-
On concavity of TAP free energy in the SK model
Authors:
Stephan Gufler,
Adrien Schertzer,
Marius A. Schmidt
Abstract:
We analyse the Hessian of the Thouless-Anderson-Palmer (TAP) free energy for the Sherrington-Kirkpatrick model, below the de Almeida-Thouless line, evaluated in Bolthausen's approximate solutions of the TAP equations. We show that the empirical spectral distribution weakly converges to a measure with negative support below the AT line, and that the support includes zero on the AT line. In this ``macroscopic'' sense, TAP free energy is concave in the order parameter of the theory, i.e. the random spin-magnetizations. This proves a spectral interpretation of the AT line. However, for specific magnetizations, the Hessian of the TAP free energy can have positive outlier eigenvalues. The question whether such outliers may also occur close to the TAP solutions is left open. In a simplified setting where the magnetizations are independent of the disorder, we prove that Plefka's second condition is equivalent to all eigenvalues being negative.
Submitted 7 July, 2023; v1 submitted 19 September, 2022;
originally announced September 2022.
-
Fluctuations of the free energy in p-spin SK models on two scales
Authors:
Anton Bovier,
Adrien Schertzer
Abstract:
20 years ago, Bovier, Kurkova, and Löwe [5] proved a central limit theorem (CLT) for the fluctuations of the free energy in the p-spin version of the Sherrington-Kirkpatrick model of spin glasses at high temperatures. In this paper we improve their results in two ways. First, we extend the range of temperatures to cover the entire regime where the quenched and annealed free energies are known to coincide. Second, we identify the main source of the fluctuations as a purely coupling dependent term, and we show a further CLT for the deviation of the free energy around this random object.
Submitted 30 May, 2022;
originally announced May 2022.
-
Undirected polymers in random environment: path properties in the mean field limit
Authors:
Nicola Kistler,
Adrien Schertzer
Abstract:
We consider the problem of undirected polymers (tied at the endpoints) in random environment, also known as the unoriented first passage percolation on the hypercube, in the limit of large dimensions. By means of the multiscale refinement of the second moment method we obtain a fairly precise geometrical description of optimal paths, i.e. of polymers with minimal energy. The picture which emerges can be loosely summarized as follows. The energy of the polymer is, to first approximation, uniformly spread along the strand. The polymer's bonds carry however a lower energy than in the directed setting, and are reached through the following geometrical evolution. Close to the origin, the polymer proceeds in oriented fashion -- it is thus as stretched as possible. The tension of the strand decreases however gradually, with the polymer allowing for more and more backsteps as it enters the core of the hypercube. Backsteps, although increasing the length of the strand, allow the polymer to connect reservoirs of energetically favorable edges which are otherwise unattainable in a fully directed regime. These reservoirs lie at mesoscopic distance apart, but in virtue of the high dimensional nature of the ambient space, the polymer manages to connect them through approximate geodesics with respect to the Hamming metric: this is the key strategy which leads to an optimal energy/entropy balance. Around halfway, the mirror picture sets in: the polymer tension gradually builds up again, until full orientedness close to the endpoint. The approach yields, as a corollary, a constructive proof of the result by Martinsson [Ann. Appl. Prob. 26 (2016), Ann. Prob. 46 (2018)] concerning the leading order of the ground state.
Submitted 7 December, 2020;
originally announced December 2020.
-
From Parisi to Boltzmann
Authors:
Goetz Kersting,
Nicola Kistler,
Adrien Schertzer,
Marius A. Schmidt
Abstract:
We sketch a new framework for the analysis of disordered systems, in particular mean field spin glasses, which is variational in nature and within the formalism of classical thermodynamics. For concreteness, only the Sherrington-Kirkpatrick model is considered here. For this we show how the Parisi solution (replica symmetric, or when replica symmetry is broken) emerges, in large but finite volumes, from a high temperature expansion to second order of the Gibbs potential with respect to order parameters encoding the law of the effective fields. In contrast with classical systems where convexity in the order parameters is the default situation, the functionals employed here are, at infinite temperature, concave: this feature is eventually due to the Gaussian nature of the interaction and implies, in particular, that the canonical Boltzmann-Gibbs variational principles must be reversed. The considerations suggest that thermodynamical phase transitions are intimately related to the divergence of the infinite expansions.
Submitted 25 February, 2019; v1 submitted 3 February, 2019;
originally announced February 2019.
-
Oriented first passage percolation in the mean field limit, 2. The extremal process
Authors:
Nicola Kistler,
Adrien Schertzer,
Marius A. Schmidt
Abstract:
This is the second, and last paper in which we address the behavior of oriented first passage percolation on the hypercube in the limit of large dimensions. We prove here that the extremal process converges to a Cox process with exponential intensity. This entails, in particular, that the first passage time converges weakly to a random shift of the Gumbel distribution. The random shift, which has an explicit, universal distribution related to modified Bessel functions of the second kind, is the sole manifestation of correlations ensuing from the geometry of Euclidean space in infinite dimensions. The proof combines the multiscale refinement of the second moment method with a conditional version of the Chen-Stein bounds, and a contraction principle.
Submitted 15 August, 2018; v1 submitted 14 August, 2018;
originally announced August 2018.
-
First passage percolation in the mean field limit
Authors:
Nicola Kistler,
Adrien Schertzer,
Marius A. Schmidt
Abstract:
The Poisson clumping heuristic has led Aldous to conjecture the value of the first passage percolation on the hypercube in the limit of large dimensions. Aldous' conjecture has been rigorously confirmed by Fill and Pemantle [Annals of Applied Probability 3 (1993)] by means of a variance reduction trick. We present here a streamlined and, we believe, more natural proof based on ideas that emerged in the study of Derrida's random energy models.
Submitted 9 April, 2018;
originally announced April 2018.