-
Symmetries, flat minima, and the conserved quantities of gradient flow
Authors:
Bo Zhao,
Iordan Ganev,
Robin Walters,
Rose Yu,
Nima Dehmamy
Abstract:
Empirical studies of the loss landscape of deep networks have revealed that many local minima are connected through low-loss valleys. Yet, little is known about the theoretical origin of such valleys. We present a general framework for finding continuous symmetries in the parameter space, which carve out low-loss valleys. Our framework uses equivariances of the activation functions and can be applied to different layer architectures. To generalize this framework to nonlinear neural networks, we introduce a novel set of nonlinear, data-dependent symmetries. These symmetries can transform a trained model such that it performs similarly on new samples, which allows ensemble building that improves robustness under certain adversarial attacks. We then show that conserved quantities associated with linear symmetries can be used to define coordinates along low-loss valleys. The conserved quantities help reveal that using common initialization methods, gradient flow only explores a small part of the global minimum. By relating conserved quantities to convergence rate and sharpness of the minimum, we provide insights on how initialization impacts convergence and generalizability.
Submitted 23 March, 2023; v1 submitted 31 October, 2022;
originally announced October 2022.
-
Quiver neural networks
Authors:
Iordan Ganev,
Robin Walters
Abstract:
We develop a uniform theoretical approach towards the analysis of various neural network connectivity architectures by introducing the notion of a quiver neural network. Inspired by quiver representation theory in mathematics, this approach gives a compact way to capture elaborate data flows in complex network architectures. As an application, we use parameter space symmetries to prove a lossless model compression algorithm for quiver neural networks with certain non-pointwise activations known as rescaling activations. In the case of radial rescaling activations, we prove that training the compressed model with gradient descent is equivalent to training the original model with projected gradient descent.
Submitted 26 July, 2022;
originally announced July 2022.
-
Universal approximation and model compression for radial neural networks
Authors:
Iordan Ganev,
Twan van Laarhoven,
Robin Walters
Abstract:
We introduce a class of fully-connected neural networks whose activation functions, rather than being pointwise, rescale feature vectors by a function depending only on their norm. We call such networks radial neural networks, extending previous work on rotation equivariant networks that considers rescaling activations in less generality. We prove universal approximation theorems for radial neural networks, including in the more difficult cases of bounded widths and unbounded domains. Our proof techniques are novel, distinct from those in the pointwise case. Additionally, radial neural networks exhibit a rich group of orthogonal change-of-basis symmetries on the vector space of trainable parameters. Factoring out these symmetries leads to a practical lossless model compression algorithm. Optimization of the compressed model by gradient descent is equivalent to projected gradient descent for the full model.
Submitted 16 February, 2023; v1 submitted 6 July, 2021;
originally announced July 2021.
-
Quantum Weyl algebras and reflection equation algebras at a root of unity
Authors:
Nicholas Cooney,
Iordan Ganev,
David Jordan
Abstract:
We compute the center and Azumaya locus in the simplest non-abelian examples of quantized multiplicative quiver varieties at a root of unity: quantum Weyl algebras of rank $N$, and quantum differential operators on the quantum group $\mathrm{GL}_2$. These examples illustrate in elementary terms much more general phenomena explored further in [Ganev-Jordan-Safronov 2019].
Submitted 25 July, 2019;
originally announced July 2019.
-
The quantum Frobenius for character varieties and multiplicative quiver varieties
Authors:
Iordan Ganev,
David Jordan,
Pavel Safronov
Abstract:
We prove that quantized multiplicative quiver varieties and quantum character varieties define sheaves of Azumaya algebras over the corresponding classical moduli spaces, and we prove that the Azumaya locus of the Kauffman bracket skein algebras contains the smooth locus, proving a strong form of the Unicity Conjecture of Bonahon and Wong. The proofs exploit a strong compatibility between quantum Hamiltonian reduction and the quantum Frobenius homomorphism as it arises in each setting. We therefore introduce the concepts of Frobenius quantum moment maps and their Hamiltonian reduction, and of Frobenius Poisson orders. We use these tools to construct canonical central subalgebras of quantum algebras, and explicitly compute the resulting Azumaya loci we encounter, using a natural nondegeneracy assumption.
Submitted 30 March, 2020; v1 submitted 31 January, 2019;
originally announced January 2019.
-
Wonderful asymptotics of matrix coefficient D-modules
Authors:
David Ben-Zvi,
Iordan Ganev
Abstract:
Beilinson-Bernstein localization realizes representations of complex reductive Lie algebras as monodromic $D$-modules on the "basic affine space" $G/N$, a torus bundle over the flag variety. A doubled version of the same space appears as the horocycle space describing the geometry of the reductive group $G$ at infinity, near the closed stratum of the wonderful compactification $\overline{G}$, or equivalently in the special fiber of the Vinberg semigroup of $G$. We show that Beilinson-Bernstein localization for $U\mathfrak g$-bimodules arises naturally as the specialization at infinity in $\overline{G}$ of the $D$-modules on $G$ describing matrix coefficients of Lie algebra representations. More generally, the asymptotics of matrix coefficient $D$-modules along any stratum of $\overline{G}$ are given by the matrix coefficient $D$-modules for parabolic restrictions. This provides a simple algebraic derivation of the relation between growth of matrix coefficients of admissible representations and $\mathfrak n$-homology. The result is an elementary consequence of the compatibility of localization with the degeneration of affine $G$-varieties to their asymptotic cones; analogous results hold for the asymptotics of the equations describing spherical functions on symmetric spaces.
Submitted 5 July, 2022; v1 submitted 4 January, 2019;
originally announced January 2019.
-
The wonderful compactification for quantum groups
Authors:
Iordan Ganev
Abstract:
In this paper, we introduce a quantum version of the wonderful compactification of a group as a certain noncommutative projective scheme. Our approach stems from the fact that the wonderful compactification encodes the asymptotics of matrix coefficients, and from its realization as a GIT quotient of the Vinberg semigroup. In order to define the wonderful compactification for a quantum group, we adopt a generalized formalism of $\mathsf{Proj}$ categories in the spirit of Artin and Zhang. Key to our construction is a quantum version of the Vinberg semigroup, which we define as a $q$-deformation of a certain Rees algebra, compatible with a standard Poisson structure. Furthermore, we discuss quantum analogues of the stratification of the wonderful compactification by orbits for a certain group action, and provide explicit computations in the case of $\mathrm{SL}_2$.
Submitted 15 September, 2016;
originally announced September 2016.
-
Quantizations of multiplicative hypertoric varieties at a root of unity
Authors:
Iordan Ganev
Abstract:
We construct quantizations of multiplicative hypertoric varieties using an algebra of q-difference operators on affine space, where q is a root of unity in C. The quantization defines a matrix bundle (i.e. Azumaya algebra) over the multiplicative hypertoric variety and admits an explicit finite étale splitting. The global sections of this Azumaya algebra form a hypertoric quantum group, and we prove a localization theorem. We introduce a general framework of Frobenius quantum moment maps and their Hamiltonian reductions; our results shed light on an instance of this framework.
Submitted 29 August, 2016; v1 submitted 22 December, 2014;
originally announced December 2014.