Learning Symmetries via Weight-Sharing with Doubly Stochastic Tensors

van der Linden, Putri A.; García-Castellanos, Alejandro; Vadgama, Sharvaree; Kuipers, Thijs P.; Bekkers, Erik J.

arXiv:2412.04594 (cs)

[Submitted on 5 Dec 2024 (v1), last revised 14 Jan 2025 (this version, v2)]

Abstract: Group equivariance has emerged as a valuable inductive bias in deep learning, enhancing generalization, data efficiency, and robustness. Classically, group equivariant methods require the groups of interest to be known beforehand, which may not be realistic for real-world data. Additionally, baking in fixed group equivariance may impose overly restrictive constraints on model architecture. This highlights the need for methods that can dynamically discover and apply symmetries as soft constraints. For neural network architectures, equivariance is commonly achieved through group transformations of a canonical weight tensor, resulting in weight sharing over a given group $G$. In this work, we propose to learn such a weight-sharing scheme by defining a collection of learnable doubly stochastic matrices that act as soft permutation matrices on canonical weight tensors, which can take regular group representations as a special case. This yields learnable kernel transformations that are jointly optimized with downstream tasks. We show that when the dataset exhibits strong symmetries, the permutation matrices will converge to regular group representations and our weight-sharing networks effectively become regular group convolutions. Additionally, the flexibility of the method enables it to effectively pick up on partial symmetries.
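
The idea of doubly stochastic matrices acting as soft permutations on a canonical weight tensor can be illustrated with a small sketch. It is not the authors' implementation: the use of Sinkhorn normalization to obtain doubly stochastic matrices is an assumption made here for illustration, and the names `sinkhorn`, `apply_soft_permutation`, `n`, and `num_transforms` are hypothetical. The sketch builds one soft permutation per learned transformation, applies each to a flattened canonical kernel, and shows that a hard cyclic shift (a regular representation of a cyclic group) is a special case.

```python
import math
import random


def sinkhorn(logits, n_iters=100):
    """Map a square matrix of real logits to an approximately doubly stochastic matrix.

    Assumption: Sinkhorn normalization is one standard way to do this; the
    abstract does not say which parameterization the paper uses.
    """
    n = len(logits)
    # Exponentiate so every entry is strictly positive.
    m = [[math.exp(x) for x in row] for row in logits]
    for _ in range(n_iters):
        # Normalize each row to sum to 1.
        m = [[x / sum(row) for x in row] for row in m]
        # Normalize each column to sum to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m


def apply_soft_permutation(matrix, weights):
    """Matrix-vector product: mix the entries of a flattened canonical kernel."""
    return [sum(p * w for p, w in zip(row, weights)) for row in matrix]


random.seed(0)
n = 4               # hypothetical size of the flattened canonical kernel
num_transforms = 3  # hypothetical number of learned transformations

canonical = [random.gauss(0.0, 1.0) for _ in range(n)]

# One logit matrix per transformation; in training these would be learnable
# parameters optimized jointly with the downstream task.
logits = [
    [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    for _ in range(num_transforms)
]
soft_perms = [sinkhorn(l) for l in logits]

# The stack of transformed kernels plays the role of the shared weights.
kernel_stack = [apply_soft_permutation(p, canonical) for p in soft_perms]

# Special case: a hard permutation matrix (here a cyclic shift) is also
# doubly stochastic, and applying it simply reorders the kernel entries.
shift = [[1.0 if j == (i + 1) % n else 0.0 for j in range(n)] for i in range(n)]
shifted = apply_soft_permutation(shift, canonical)
```

In this sketch every matrix in `soft_perms` has rows and columns summing to 1 up to numerical tolerance, so each transformed kernel is a mixture of the canonical kernel's entries; `shifted` contains the same entries as `canonical` rotated by one position.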

Comments:	19 pages, 14 figures, 4 tables
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2412.04594 [cs.LG]
	(or arXiv:2412.04594v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2412.04594
Journal reference:	Advances in Neural Information Processing Systems (NeurIPS) 2024

Submission history

From: Putri Van Der Linden
[v1] Thu, 5 Dec 2024 20:15:34 UTC (2,326 KB)
[v2] Tue, 14 Jan 2025 11:03:05 UTC (2,327 KB)
