Estimating the Euclidean distortion of an orbit space

Authors: Ben Blum-Smith, Harm Derksen, Dustin G. Mixon, Yousef Qaddura, Brantley Vose

Abstract: Given a finite-dimensional inner product space $V$ and a group $G$ of isometries, we consider the problem of embedding the orbit space $V/G$ into a Hilbert space in a way that preserves the quotient metric as well as possible. This inquiry is motivated by applications to invariant machine learning. We introduce several new theoretical tools before using them to tackle various fundamental instances of this problem.

Submitted 4 June, 2025; originally announced June 2025.

arXiv:2405.08097 [pdf, other]

A Galois theorem for machine learning: Functions on symmetric matrices and point clouds via lightweight invariant features

Authors: Ben Blum-Smith, Ningyuan Huang, Marco Cuturi, Soledad Villar

Abstract: In this work, we present a mathematical formulation for machine learning of (1) functions on symmetric matrices that are invariant with respect to the action of permutations by conjugation, and (2) functions on point clouds that are invariant with respect to rotations, reflections, and permutations of the points. To achieve this, we provide a general construction of generically separating invariant features using ideas inspired by Galois theory. We construct $O(n^2)$ invariant features derived from generators for the field of rational functions on $n\times n$ symmetric matrices that are invariant under joint permutations of rows and columns. We show that these invariant features can separate all distinct orbits of symmetric matrices except for a measure zero set; such features can be used to universally approximate invariant functions on almost all weighted graphs. For point clouds in a fixed dimension, we prove that the number of invariant features can be reduced, generically without losing expressivity, to $O(n)$, where $n$ is the number of points. We combine these invariant features with DeepSets to learn functions on symmetric matrices and point clouds with varying sizes. We empirically demonstrate the feasibility of our approach on molecule property regression and point cloud distance prediction.

Submitted 13 February, 2025; v1 submitted 13 May, 2024; originally announced May 2024.

MSC Class: 68P01; 13A50

arXiv:2305.12585 [pdf, other]

Equivariant geometric convolutions for emulation of dynamical systems

Authors: Wilson G. Gregory, David W. Hogg, Ben Blum-Smith, Maria Teresa Arias, Kaze W. K. Wong, Soledad Villar

Abstract: Machine learning methods are increasingly being employed as surrogate models in place of computationally expensive and slow numerical integrators for a bevy of applications in the natural sciences. However, while the laws of physics are relationships between scalars, vectors, and tensors that hold regardless of the frame of reference or chosen coordinate system, surrogate machine learning models are not coordinate-free by default. We enforce coordinate freedom by using geometric convolutions in three model architectures: a ResNet, a Dilated ResNet, and a UNet. In numerical experiments emulating 2D compressible Navier-Stokes, we see better accuracy and improved stability compared to baseline surrogate models in almost all cases. The ease of enforcing coordinate freedom without making major changes to the model architecture provides an exciting recipe for any CNN-based method applied to an appropriate class of problems.

Submitted 1 November, 2024; v1 submitted 21 May, 2023; originally announced May 2023.

arXiv:2209.14991 [pdf, ps, other]

doi 10.1090/noti2760

Machine learning and invariant theory

Authors: Ben Blum-Smith, Soledad Villar

Abstract: Inspired by constraints from physical law, equivariant machine learning restricts the learning to a hypothesis class where all the functions are equivariant with respect to some group action. Irreducible representations or invariant theory are typically used to parameterize the space of such functions. In this article, we introduce the topic and explain a couple of methods to explicitly parameterize equivariant functions that are being used in machine learning applications. In particular, we explicate a general procedure, attributed to Malgrange, to express all polynomial maps between linear spaces that are equivariant under the action of a group $G$, given a characterization of the invariant polynomials on a bigger space. The method also parametrizes smooth equivariant maps in the case that $G$ is a compact Lie group.

Submitted 25 March, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

Journal ref: Notices of the American Mathematical Society 70(8): 1205--1213, 2023

arXiv:2204.00887 [pdf, other]

Dimensionless machine learning: Imposing exact units equivariance

Authors: Soledad Villar, Weichi Yao, David W. Hogg, Ben Blum-Smith, Bianca Dumitrascu

Abstract: Units equivariance (or units covariance) is the exact symmetry that follows from the requirement that relationships among measured quantities of physics relevance must obey self-consistent dimensional scalings. Here, we express this symmetry in terms of a (non-compact) group action, and we employ dimensional analysis and ideas from equivariant machine learning to provide a methodology for exactly units-equivariant machine learning: For any given learning task, we first construct a dimensionless version of its inputs using classic results from dimensional analysis, and then perform inference in the dimensionless space. Our approach can be used to impose units equivariance across a broad range of machine learning methods which are equivariant to rotations and other groups. We discuss the in-sample and out-of-sample prediction accuracy gains one can obtain in contexts like symbolic regression and emulation, where symmetry is important. We illustrate our approach with simple numerical examples involving dynamical systems in physics and ecology.

Submitted 31 December, 2022; v1 submitted 2 April, 2022; originally announced April 2022.

Journal ref: Journal of Machine Learning Research 24 (2023) 1--32

arXiv:2106.06610 [pdf, other]

Scalars are universal: Equivariant machine learning, structured like classical physics

Authors: Soledad Villar, David W. Hogg, Kate Storey-Fisher, Weichi Yao, Ben Blum-Smith

Abstract: There has been enormous progress in the last few years in designing neural networks that respect the fundamental symmetries and coordinate freedoms of physical law. Some of these frameworks make use of irreducible representations, some make use of high-order tensor objects, and some apply symmetry-enforcing constraints. Different physical laws obey different combinations of fundamental symmetries, but a large fraction (possibly all) of classical physics is equivariant to translation, rotation, reflection (parity), boost (relativity), and permutations. Here we show that it is simple to parameterize universally approximating polynomial functions that are equivariant under these symmetries, or under the Euclidean, Lorentz, and Poincaré groups, at any dimensionality $d$. The key observation is that nonlinear O($d$)-equivariant (and related-group-equivariant) functions can be universally expressed in terms of a lightweight collection of scalars -- scalar products and scalar contractions of the scalar, vector, and tensor inputs. We complement our theory with numerical examples that show that the scalar-based method is simple, efficient, and scalable.

Submitted 7 February, 2023; v1 submitted 11 June, 2021; originally announced June 2021.

Comments: NeurIPS 2021

Journal ref: Advances in Neural Information Processing Systems, 34, 28848-28863. 2021

arXiv:1712.10163 [pdf, ps, other]

doi 10.1016/j.acha.2023.06.001

Estimation under group actions: recovering orbits from invariants

Authors: Afonso S. Bandeira, Ben Blum-Smith, Joe Kileel, Amelia Perry, Jonathan Niles-Weed, Alexander S. Wein

Abstract: We study a class of orbit recovery problems in which we observe independent copies of an unknown element of $\mathbb{R}^p$, each linearly acted upon by a random element of some group (such as $\mathbb{Z}/p$ or $\mathrm{SO}(3)$) and then corrupted by additive Gaussian noise. We prove matching upper and lower bounds on the number of samples required to approximately recover the group orbit of this unknown element with high probability. These bounds, based on quantitative techniques in invariant theory, give a precise correspondence between the statistical difficulty of the estimation problem and algebraic properties of the group. Furthermore, we give computer-assisted procedures to certify these properties that are computationally efficient in many cases of interest. The model is motivated by geometric problems in signal processing, computer vision, and structural biology, and applies to the reconstruction problem in cryo-electron microscopy (cryo-EM), a problem of significant practical interest. Our results allow us to verify (for a given problem size) that if cryo-EM images are corrupted by noise with variance $\sigma^2$, the number of images required to recover the molecule structure scales as $\sigma^6$. We match this bound with a novel (albeit computationally expensive) algorithm for ab initio reconstruction in cryo-EM, based on invariant features of degree at most 3. We further discuss how to recover multiple molecular structures from mixed (or heterogeneous) cryo-EM samples.

Submitted 13 June, 2023; v1 submitted 29 December, 2017; originally announced December 2017.

Comments: 81 pages. Minor revisions since previous version, reflecting peer review feedback. To be published in Applied and Computational Harmonic Analysis

MSC Class: 62F10; 92C55; 16W22

Journal ref: Applied and Computational Harmonic Analysis 66 (2023) 236--319

Showing 1–7 of 7 results for author: Blum-Smith, B