-
arXiv:2505.11846
Learning on a Razor's Edge: the Singularity Bias of Polynomial Neural Networks
Abstract: Deep neural networks often infer sparse representations, converging to a subnetwork during the learning process. In this work, we theoretically analyze subnetworks and their bias through the lens of algebraic geometry. We consider fully-connected networks with polynomial activation functions, and focus on the geometry of the function space they parametrize, often referred to as neuromanifold. Firs…
Submitted 17 May, 2025; originally announced May 2025.
-
Rubik's Abstract Polytopes
Abstract: We generalize the Rubik's cube, together with its group of configurations, to any abstract regular polytope. After discussing general aspects, we study the Rubik's simplex of arbitrary dimension and provide a complete description of the associated group. We sketch an analogous argument for the Rubik's hypercube as well.
Submitted 19 February, 2025; originally announced February 2025.
-
arXiv:2501.18915
Algebra Unveils Deep Learning -- An Invitation to Neuroalgebraic Geometry
Abstract: In this position paper, we promote the study of function spaces parameterized by machine learning models through the lens of algebraic geometry. To this end, we focus on algebraic models, such as neural networks with polynomial activations, whose associated function spaces are semi-algebraic varieties. We outline a dictionary between algebro-geometric invariants of these varieties, such as dimensi…
Submitted 30 May, 2025; v1 submitted 31 January, 2025; originally announced January 2025.
Comments: Published at ICML 2025
-
On the Geometry and Optimization of Polynomial Convolutional Networks
Abstract: We study convolutional neural networks with monomial activation functions. Specifically, we prove that their parameterization map is regular and is an isomorphism almost everywhere, up to rescaling the filters. By leveraging on tools from algebraic geometry, we explore the geometric properties of the image in function space of this map - typically referred to as neuromanifold. In particular, we co…
Submitted 3 March, 2025; v1 submitted 1 October, 2024; originally announced October 2024.
Comments: Accepted at AISTATS 2025
-
Geometry of Lightning Self-Attention: Identifiability and Dimension
Abstract: We consider function spaces defined by self-attention networks without normalization, and theoretically analyze their geometry. Since these networks are polynomial, we rely on tools from algebraic geometry. In particular, we study the identifiability of deep attention by providing a description of the generic fibers of the parametrization for an arbitrary number of layers and, as a consequence, co…
Submitted 19 February, 2025; v1 submitted 30 August, 2024; originally announced August 2024.
Comments: Accepted at ICLR 2025
-
Equivariant Representation Learning in the Presence of Stabilizers
Abstract: We introduce Equivariant Isomorphic Networks (EquIN) -- a method for learning representations that are equivariant with respect to general group actions over data. Differently from existing equivariant representation learners, EquIN is suitable for group actions that are not free, i.e., that stabilize data via nontrivial symmetries. EquIN is theoretically grounded in the orbit-stabilizer theorem f…
Submitted 16 September, 2023; v1 submitted 12 January, 2023; originally announced January 2023.
Comments: NeurIPS Workshop on Symmetry and Geometry in Neural Representations (v1), European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (v2)
-
Equivariant Representation Learning via Class-Pose Decomposition
Abstract: We introduce a general method for learning representations that are equivariant to symmetries of data. Our central idea is to decompose the latent space into an invariant factor and the symmetry group itself. The components semantically correspond to intrinsic data classes and poses respectively. The learner is trained on a loss encouraging equivariance based on supervision from relative symmetry…
Submitted 7 February, 2023; v1 submitted 7 July, 2022; originally announced July 2022.
Comments: 12 pages
-
arXiv:1806.00883
The (he)art of gluing
Abstract: We introduce a notion of gluability for poset-indexed Bridgeland slicings on triangulated categories and show how a gluing abelian slicing on the heart of a bounded $t$-structure naturally induces a family of perverse $t$-structures. Our setup generalises the one of Collins and Polishchuk. As a corollary we recover several constructions from the theory of $t$-structures on triangulated categories.…
Submitted 3 June, 2018; originally announced June 2018.
Comments: 28 pages
-
Palindromic Bernoulli distributions
Abstract: We introduce and study a subclass of joint Bernoulli distributions which has the palindromic property. For such distributions the vector of joint probabilities is unchanged when the order of the elements is reversed. We prove for binary variables that the palindromic property is equivalent to zero constraints on all odd-order interaction parameters, be it in parameterizations which are log-linear,…
Submitted 5 May, 2016; v1 submitted 30 October, 2015; originally announced October 2015.
Comments: 17 pages, 1 figure, 5 tables
-
arXiv:1501.04658
Hearts and towers in stable infinity-categories
Abstract: We exploit the equivalence between $t$-structures and normal torsion theories on a stable $\infty$-category to show how a few classical topics in the theory of triangulated categories, i.e., the characterization of bounded $t$-structures in terms of their hearts, their associated cohomology functors, semiorthogonal decompositions, and the theory of tiltings, as well as the more recent notion of Br…
Submitted 1 May, 2019; v1 submitted 19 January, 2015; originally announced January 2015.
Comments: Final version, ready for printing, accepted on JHRS
MSC Class: 18E30; 18A32; 18E35; 16E35
-
arXiv:0906.2098
Chain graph models of multivariate regression type for categorical data
Abstract: We discuss a class of chain graph models for categorical variables defined by what we call a multivariate regression chain graph Markov property. First, the set of local independencies of these models is shown to be Markov equivalent to those of a chain graph model recently defined in the literature. Next we provide a parametrization based on a sequence of generalized linear models with a multivar…
Submitted 13 July, 2011; v1 submitted 11 June, 2009; originally announced June 2009.
Comments: Published at http://dx.doi.org/10.3150/10-BEJ300 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
Report number: IMS-BEJ-BEJ300
Journal ref: Bernoulli 2011, Vol. 17, No. 3, 827-844
-
arXiv:0904.0333
Matrix representations and independencies in directed acyclic graphs
Abstract: For a directed acyclic graph, there are two known criteria to decide whether any specific conditional independence statement is implied for all distributions factorized according to the given graph. Both criteria are based on special types of path in graphs. They are called separation criteria because independence holds whenever the conditioning set is a separating set in a graph theoretical sen…
Submitted 2 April, 2009; originally announced April 2009.
Comments: Published at http://dx.doi.org/10.1214/08-AOS594 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AOS-AOS594
MSC Class: 62H99 (Primary) 62H05; 05C50 (Secondary)
Journal ref: Annals of Statistics 2009, Vol. 37, No. 2, 961-978
-
arXiv:0801.1440
Parameterizations and fitting of bi-directed graph models to categorical data
Abstract: We discuss two parameterizations of models for marginal independencies for discrete distributions which are representable by bi-directed graph models, under the global Markov property. Such models are useful data analytic tools especially if used in combination with other graphical models. The first parameterization, in the saturated case, is also known as the multivariate logistic transformatio…
Submitted 9 January, 2008; originally announced January 2008.