-
Universal characteristics of deep neural network loss surfaces from random matrix theory
Authors:
Nicholas P Baskerville,
Jonathan P Keating,
Francesco Mezzadri,
Joseph Najnudel,
Diego Granziol
Abstract:
This paper considers several aspects of random matrix universality in deep neural networks. Motivated by recent experimental work, we use universal properties of random matrices related to local statistics to derive practical implications for deep neural networks based on a realistic model of their Hessians. In particular we derive universal aspects of outliers in the spectra of deep neural networks and demonstrate the important role of random matrix local laws in popular pre-conditioning gradient descent algorithms. We also present insights into deep neural network loss surfaces from quite general arguments based on tools from statistical physics and random matrix theory.
Submitted 20 June, 2022; v1 submitted 17 May, 2022;
originally announced May 2022.
-
A spin-glass model for the loss surfaces of generative adversarial networks
Authors:
Nicholas P Baskerville,
Jonathan P Keating,
Francesco Mezzadri,
Joseph Najnudel
Abstract:
We present a novel mathematical model that seeks to capture the key design feature of generative adversarial networks (GANs). Our model consists of two interacting spin glasses, and we conduct an extensive theoretical analysis of the complexity of the model's critical points using techniques from Random Matrix Theory. The result is insights into the loss surfaces of large GANs that build upon prior insights for simpler networks, but also reveal new structure unique to this setting.
Submitted 7 January, 2021;
originally announced January 2021.
-
The Loss Surfaces of Neural Networks with General Activation Functions
Authors:
Nicholas P. Baskerville,
Jonathan P. Keating,
Francesco Mezzadri,
Joseph Najnudel
Abstract:
The loss surfaces of deep neural networks have been the subject of several studies, theoretical and experimental, over the last few years. One strand of work considers the complexity, in the sense of local optima, of high-dimensional random functions with the aim of informing how local optimisation methods may perform in such complicated settings. Prior work of Choromanska et al. (2015) established a direct link between the training loss surfaces of deep multi-layer perceptron networks and spherical multi-spin glass models under some very strong assumptions on the network and its data. In this work, we test the validity of this approach by removing the undesirable restriction to ReLU activation functions. In doing so, we chart a new path through the spin glass complexity calculations using supersymmetric methods in Random Matrix Theory which may prove useful in other contexts. Our results shed new light on both the strengths and the weaknesses of spin glass models in this context.
Submitted 8 June, 2021; v1 submitted 8 April, 2020;
originally announced April 2020.
-
Extreme values of CUE characteristic polynomials: a numerical study
Authors:
Yan V. Fyodorov,
Sven Gnutzmann,
Jonathan P. Keating
Abstract:
We present the results of systematic numerical computations relating to the extreme value statistics of the characteristic polynomials of random unitary matrices drawn from the Circular Unitary Ensemble (CUE) of Random Matrix Theory. In particular, we investigate a range of recent conjectures and theoretical results inspired by analogies with the theory of logarithmically-correlated Gaussian random fields. These include phenomena related to the conjectured freezing transition. Our numerical results are consistent with, and therefore support, the previous conjectures and theory. We also go beyond previous investigations in several directions: we provide the first quantitative evidence in support of a correlation between extreme values of the characteristic polynomials and large gaps in the spectrum, we investigate the rate of convergence to the limiting formulae previously considered, and we extend the previous analysis of the CUE to the C$β$E which corresponds to allowing the degree of the eigenvalue repulsion to become a parameter.
Submitted 23 October, 2018; v1 submitted 1 June, 2018;
originally announced June 2018.
-
On relations between one-dimensional quantum and two-dimensional classical spin systems
Authors:
J. Hutchinson,
J. P. Keating,
F. Mezzadri
Abstract:
We exploit mappings between quantum and classical systems in order to obtain a class of two-dimensional classical systems with critical properties equivalent to those of the class of one-dimensional quantum systems discussed in a companion paper (J. Hutchinson, J. P. Keating, and F. Mezzadri, arXiv:1503.05732). In particular, we use three approaches: the Trotter-Suzuki mapping; the method of coherent states; and a calculation based on commuting the quantum Hamiltonian with the transfer matrix of a classical system. This enables us to establish universality of certain critical phenomena by extension from the results in our previous article for the classical systems identified.
Submitted 26 March, 2015;
originally announced March 2015.
-
Random matrix theory and critical phenomena in quantum spin chains
Authors:
J. Hutchinson,
J. P. Keating,
F. Mezzadri
Abstract:
We compute critical properties of a general class of quantum spin chains which are quadratic in the Fermi operators and can be solved exactly under certain symmetry constraints related to the classical compact groups $U(N)$, $O(N)$ and $Sp(2N)$. In particular we calculate critical exponents $s$, $ν$ and $z$, corresponding to the energy gap, correlation length and dynamic exponent respectively. We also compute the ground state correlators $\left\langle σ^{x}_{i} σ^{x}_{i+n} \right\rangle_{g}$, $\left\langle σ^{y}_{i} σ^{y}_{i+n} \right\rangle_{g}$ and $\left\langle \prod^{n}_{i=1} σ^{z}_{i} \right\rangle_{g}$, all of which display quasi-long-range order with a critical exponent dependent upon system parameters. Our approach establishes universality of the exponents for the class of systems in question.
Submitted 19 March, 2015;
originally announced March 2015.
-
Freezing Transitions and Extreme Values: Random Matrix Theory, $ζ(1/2+it)$, and Disordered Landscapes
Authors:
Yan V. Fyodorov,
Jonathan P. Keating
Abstract:
We argue that the freezing transition scenario, previously conjectured to occur in the statistical mechanics of 1/f-noise random energy models, governs, after reinterpretation, the value distribution of the maximum of the modulus of the characteristic polynomials $p_N(θ)$ of large $N \times N$ random unitary (CUE) matrices; i.e. the extreme value statistics of $p_N(θ)$ when $N \rightarrow \infty$. In addition, we argue that it leads to multifractal-like behaviour in the total length $μ_N(x)$ of the intervals in which $|p_N(θ)| > N^x$, $x > 0$, in the same limit. We speculate that our results extend to the large values taken by the Riemann zeta-function $ζ(s)$ over stretches of the critical line $s = 1/2 + it$ of given constant length, and present the results of numerical computations of the large values of $ζ(1/2+it)$. Our main purpose is to draw attention to the unexpected connections between these different extreme value problems.
Submitted 23 December, 2013; v1 submitted 26 November, 2012;
originally announced November 2012.
-
Freezing Transition, Characteristic Polynomials of Random Matrices, and the Riemann Zeta-Function
Authors:
Yan V. Fyodorov,
Ghaith A Hiary,
Jonathan P. Keating
Abstract:
We argue that the freezing transition scenario, previously explored in the statistical mechanics of 1/f-noise random energy models, also determines the value distribution of the maximum of the modulus of the characteristic polynomials of large $N \times N$ random unitary (CUE) matrices. We postulate that our results extend to the extreme values taken by the Riemann zeta-function $ζ(s)$ over sections of the critical line $s = 1/2 + it$ of constant length and present the results of numerical computations in support. Our main purpose is to draw attention to possible connections between the statistical mechanics of random energy landscapes, random matrix theory, and the theory of the Riemann zeta function.
Submitted 27 April, 2012; v1 submitted 21 February, 2012;
originally announced February 2012.
-
Eigenfunction Statistics on Quantum Graphs
Authors:
S. Gnutzmann,
J. P. Keating,
F. Piotet
Abstract:
We investigate the spatial statistics of the energy eigenfunctions on large quantum graphs. It has previously been conjectured that these should be described by a Gaussian Random Wave Model, by analogy with quantum chaotic systems, for which such a model was proposed by Berry in 1977. The autocorrelation functions we calculate for an individual quantum graph exhibit a universal component, which completely determines a Gaussian Random Wave Model, and a system-dependent deviation. This deviation depends on the graph only through its underlying classical dynamics. Classical criteria for quantum universality to be met asymptotically in the large graph limit (i.e. for the non-universal deviation to vanish) are then extracted. We use an exact field theoretic expression in terms of a variant of a supersymmetric sigma model. A saddle-point analysis of this expression leads to the estimates. In particular, intensity correlations are used to discuss the possible equidistribution of the energy eigenfunctions in the large graph limit. When equidistribution is asymptotically realized, our theory predicts a rate of convergence that is a significant refinement of previous estimates. The universal and system-dependent components of intensity correlation functions are recovered by means of an exact trace formula which we analyse in the diagonal approximation, drawing in this way a parallel between the field theory and semiclassics. Our results provide the first instance where an asymptotic Gaussian Random Wave Model has been established microscopically for eigenfunctions in a system with no disorder.
Submitted 6 May, 2010;
originally announced May 2010.
-
Quantum ergodicity on graphs
Authors:
S. Gnutzmann,
J. P. Keating,
F. Piotet
Abstract:
We investigate the equidistribution of the eigenfunctions on quantum graphs in the high-energy limit. Our main result is an estimate of the deviations from equidistribution for large well-connected graphs. We use an exact field-theoretic expression in terms of a variant of the supersymmetric nonlinear sigma-model. Our estimate is based on a saddle-point analysis of this expression and leads to a criterion for when equidistribution emerges asymptotically in the limit of large graphs. Our theory predicts a rate of convergence that is a significant refinement of previous estimates, long-assumed to be valid for quantum chaotic systems, agreeing with them in some situations but not all. We discuss specific examples for which the theory is tested numerically.
Submitted 29 August, 2008;
originally announced August 2008.
-
Fractional $\hbar$-scaling for quantum kicked rotors without cantori
Authors:
J. Wang,
T. S. Monteiro,
S. Fishman,
J. P. Keating,
R. Schubert
Abstract:
Previous studies of quantum delta-kicked rotors have found momentum probability distributions with a typical width (localization length $L$) characterized by fractional $\hbar$-scaling, i.e. $L \sim \hbar^{2/3}$ in regimes and phase-space regions close to `golden-ratio' cantori. In contrast, in typical chaotic regimes, the scaling is integer, $L \sim \hbar^{-1}$. Here we consider a generic variant of the kicked rotor, the random-pair-kicked particle (RP-KP), obtained by randomizing the phases every second kick; it has no KAM mixed phase-space structures, like golden-ratio cantori, at all. Our unexpected finding is that, over comparable phase-space regions, it also has fractional scaling, but $L \sim \hbar^{-2/3}$. A semiclassical analysis indicates that the $\hbar^{2/3}$ scaling here is of quantum origin and is not a signature of classical cantori.
Submitted 19 October, 2007; v1 submitted 27 February, 2007;
originally announced February 2007.
-
A new correlator in quantum spin chains
Authors:
J. P. Keating,
F. Mezzadri,
M. Novaes
Abstract:
We propose a new correlator in one-dimensional quantum spin chains, the $s$-Emptiness Formation Probability ($s$-EFP). This is a natural generalization of the Emptiness Formation Probability (EFP), which is the probability that the first $n$ spins of the chain are all aligned downwards. In the $s$-EFP we let the spins in question be separated by $s$ sites. The usual EFP corresponds to the special case when $s=1$, and taking $s>1$ allows us to quantify non-local correlations. We express the $s$-EFP for the anisotropic XY model in a transverse magnetic field, a system with both critical and non-critical regimes, in terms of a Toeplitz determinant. For the isotropic XY model we find that the magnetic field induces an interesting length scale.
Submitted 6 April, 2006;
originally announced April 2006.
-
Universal quantum signature of mixed dynamics in antidot lattices
Authors:
J. P. Keating,
S. D. Prado,
M. Sieber
Abstract:
We investigate phase coherent ballistic transport through antidot lattices in the generic case where the classical phase space has both regular and chaotic components. It is shown that the conductivity fluctuations have a non-Gaussian distribution, and that their moments have a power-law dependence on a semiclassical parameter, with fractional exponents. These exponents are obtained from bifurcating periodic orbits in the semiclassical approximation. They are universal in situations where sufficiently long orbits contribute.
Submitted 8 September, 2005;
originally announced September 2005.
-
Negative moments of characteristic polynomials of random GOE matrices and singularity-dominated strong fluctuations
Authors:
Yan V. Fyodorov,
Jonathan P. Keating
Abstract:
We calculate the negative integer moments of the (regularized) characteristic polynomials of $N \times N$ random matrices taken from the Gaussian Orthogonal Ensemble (GOE) in the limit as $N \to \infty$. The results agree nontrivially with a recent conjecture of Berry & Keating motivated by techniques developed in the theory of singularity-dominated strong fluctuations. This is the first example where nontrivial predictions obtained using these techniques have been proved.
Submitted 17 December, 2002;
originally announced December 2002.
-
Force and impulse from an Aharonov-Bohm flux line
Authors:
J. P. Keating,
J. M. Robbins
Abstract:
We calculate the force and impulse operators for a charged particle in the field of an Aharonov-Bohm flux line. The force operator is formally the Lorentz force, with the magnetic field operator modified to include quantum corrections due to anomalous commutation relations. Expectation values for stationary states are calculated. Nonstationary states are treated by integrating the force operator in time to obtain the impulse operator. Expectation values of the impulse are calculated for slow wavepackets (which spread faster than they move) and for fast wavepackets (which spread only negligibly before their closest approach to the flux line). We give two derivations of the force and impulse operators, the first a simple derivation based on formal arguments, and the second a rigorous calculation of wavepacket expectation values. We also show that the same expressions for the force and impulse are obtained if the flux line is enclosed in an impenetrable cylinder, or distributed uniformly over a flux cylinder, in the limit that the radius of the cylinder goes to zero.
Submitted 6 December, 2002;
originally announced December 2002.