-
Genus expansion for non-linear random matrix ensembles with applications to neural networks
Authors:
Nicola Muca Cirone,
Jad Hamdan,
Cristopher Salvi
Abstract:
We present a unified approach to studying certain non-linear random matrix ensembles and associated random neural networks at initialization. This begins with a novel series expansion for neural networks which generalizes Faà di Bruno's formula to an arbitrary number of compositions. The role of monomials is played by random multilinear maps indexed by directed graphs, whose edges correspond to random matrices. Crucially, this expansion linearizes the effect of the activation functions, allowing for the direct application of Wick's principle and the genus expansion technique. As an application, we prove several results about neural networks with random weights. First, we give a new proof of the fact that they converge to Gaussian processes as their width tends to infinity. Second, we quantify the rate of convergence of the Neural Tangent Kernel to its deterministic limit in Frobenius norm. Finally, we compute the moments of the limiting spectral distribution of the Jacobian (only the first two of which were previously known), expressing them as sums over non-crossing partitions. All of these results are then generalized to the case of neural networks with sparse and non-Gaussian weights, under moment assumptions.
Submitted 13 May, 2025; v1 submitted 11 July, 2024;
originally announced July 2024.
-
The Fyodorov-Hiary-Keating Conjecture on Mesoscopic Intervals
Authors:
Louis-Pierre Arguin,
Jad Hamdan
Abstract:
We derive precise upper bounds for the maximum of the Riemann zeta function on short intervals on the critical line, showing that for any $θ\in(-1,0]$, the set of $t\in [T,2T]$ for which $$\max_{|h|\leq \log^θ T}|ζ(\tfrac{1}{2}+it+ih)|>\exp\bigg({y+\sqrt{(\log\log T)|θ|/2}\cdot {S}}\bigg)\frac{(\log T)^{(1+θ)}}{(\log\log T)^{3/4}}$$ is bounded above by $Cy\exp({-2y-y^2/((1+θ)\log\log T)})$ (where $S$ is a random variable that is approximately a standard Gaussian as $T$ tends to infinity). This settles a strong form of a conjecture of Fyodorov-Hiary-Keating in mesoscopic intervals, which was previously known only to leading order. Using similar techniques, we also derive upper bounds for the second moment of the zeta function on such intervals. Conditioning on the value of $S$, we show that for all $t\in[T,2T]$ outside a set of order $o(T)$, $$\frac{1}{\log^θ T}\int_{|h|\leq \log^θ T} |ζ(\tfrac{1}{2}+it+ih)|^2\mathrm{d}h \ll e^{2S}\cdot \left(\frac{(\log T)^{(1+θ)}}{\sqrt{\log\log T}}\right).$$ This proves a weak form of another conjecture of Fyodorov-Keating and generalizes a result of Harper, which is recovered at $θ=0$ (in which case $S$ is defined to be zero). Our main tool is an adaptation to mesoscopic intervals of the recursive scheme introduced by one of the authors together with Bourgade and Radziwiłł.
Submitted 10 May, 2024;
originally announced May 2024.
-
A Fixed-Point Approach to Non-Commutative Central Limit Theorems
Authors:
Jad Hamdan
Abstract:
We show how the renormalization group approach can be used to prove quantitative central limit theorems (CLTs) in the setting of free, Boolean, bi-free and bi-Boolean independence under finite third moment assumptions. The proofs rely on the construction of a contractive metric over the space of probability measures over $\mathbb{R}$ or $\mathbb{R}^2$, which has the appropriate analogue of a Gaussian distribution as a fixed point (for instance, the semicircle law in the case of free independence). In all cases, this yields a convergence rate of $1/\sqrt{n}$, and we show that this can be improved to $1/n$ in some instances under stronger assumptions.
Submitted 28 February, 2024; v1 submitted 11 May, 2023;
originally announced May 2023.
-
A note on the exact simulation of a random eigenvalue of a GUE matrix
Authors:
Luc Devroye,
Jad Hamdan
Abstract:
We develop a simple algorithm to generate random variables described by densities equaling squared Hermite functions. As an application, we show how to generate a randomly chosen eigenvalue of a matrix from the Gaussian Unitary Ensemble (GUE) in sub-linear expected time.
Submitted 25 March, 2025; v1 submitted 7 April, 2023;
originally announced April 2023.
-
Density estimation using cellular binary trees and an application to monotone densities
Authors:
Luc Devroye,
Jad Hamdan
Abstract:
Consider a density $f$ on $[0,1]$ that must be estimated from an i.i.d. sample $X_1,\ldots,X_n$ drawn from $f$. In this note, we study binary-tree-based histogram estimates that use recursive splitting of intervals. If the decision to split an interval is a (possibly randomized) function of the number of data points in the interval only, then we speak of an estimate of complexity one. We exhibit a universally consistent estimate of complexity one. If the decision to split is a function of the cardinalities of $k$ equal-length sub-intervals, then we speak of an estimate of complexity $k$. We propose an estimate of complexity two that can estimate any bounded monotone density on $[0,1]$ with optimal expected total variation error $O(n^{-1/3})$.
Submitted 22 April, 2025; v1 submitted 15 March, 2022;
originally announced March 2022.
-
The lattice of arithmetic progressions
Authors:
Marcel K. Goh,
Jad Hamdan,
Jonah Saks
Abstract:
This paper concerns the lattice $L_n$ of subsets of $\{1,\ldots,n\}$ that are arithmetic progressions, under the inclusion order. For $n\geq 4$, this poset is not graded and thus not semimodular. We give three independent proofs of the fact that for $n\geq 2$, $μ_n(L_n) = μ(n-1)$, where $μ_n$ is the Möbius function of $L_n$ and $μ$ is the classical (number-theoretic) Möbius function. We also show that $L_n$ is comodernistic, which implies that $L_n$ is EL-labelable. Comodernism is then used to prove that the order complex $Δ_n$ of the lattice is either contractible or homotopy equivalent to a sphere.
Submitted 2 September, 2021; v1 submitted 10 June, 2021;
originally announced June 2021.
-
Universal height and width bounds for random trees
Authors:
Louigi Addario-Berry,
Anna Brandenberger,
Jad Hamdan,
Céline Kerriou
Abstract:
We prove non-asymptotic stretched exponential tail bounds on the height of a randomly sampled node in a random combinatorial tree, which we use to prove bounds on the heights and widths of random trees from a variety of models. Our results allow us to prove a conjecture and settle an open problem of Janson (https://doi.org/10.1214/11-PS188), and nearly prove another conjecture and settle another open problem from the same work (up to a polylogarithmic factor).
The key tool for our work is an equivalence in law between the degrees along the path to a random node in a random tree with given degree statistics, and a random truncation of a size-biased ordering of the degrees of such a tree. We also exploit a Poissonization trick introduced by Camarri and Pitman (https://doi.org/10.1214/EJP.v5-58) in the context of inhomogeneous continuum random trees, which we adapt to the setting of random trees with fixed degrees.
Finally, we propose and justify a change to the conventions of branching process nomenclature: the name "Galton-Watson trees" should be permanently retired by the community, and replaced with the name "Bienaymé trees".
Submitted 25 April, 2022; v1 submitted 7 May, 2021;
originally announced May 2021.