Accelerating Machine Learning Systems via Category Theory: Applications to Spherical Attention for Gene Regulatory Networks

Abbott, Vincent; Kamiya, Kotaro; Glowacki, Gerard; Atsumi, Yu; Zardini, Gioele; Maruyama, Yoshihiro

arXiv:2505.09326 (math)

[Submitted on 14 May 2025]


Abstract: How do we enable artificial intelligence models to improve themselves? This question is central to exponentially improving generalized artificial intelligence models, which can improve their own architecture to handle new problem domains efficiently while leveraging the latest hardware. However, current automated compilation methods are poor, and efficient algorithms require years of human development. In this paper, we use neural circuit diagrams, which are grounded in category theory, to prove a general theorem related to deep learning algorithms, guide the development of a novel attention algorithm tailored to the domain of gene regulatory networks, and produce a corresponding efficient kernel. The algorithm we propose, spherical attention, shows that neural circuit diagrams enable a principled and systematic method for reasoning about deep learning architectures and producing high-performance code. By replacing SoftMax with an $L^2$ norm as suggested by diagrams, it overcomes the special function unit bottleneck of standard attention while retaining the streaming property essential to high performance. Our diagrammatically derived FlashSign kernel achieves performance comparable to the state-of-the-art, fine-tuned FlashAttention algorithm on an A100, and $3.6\times$ the performance of PyTorch. Overall, this investigation shows neural circuit diagrams' suitability as a high-level framework for the automated development of efficient, novel artificial intelligence architectures.
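
The abstract does not give the formula for spherical attention, so the following is only a minimal sketch under a stated assumption: each row of the score matrix $QK^\top$ is divided by its $L^2$ norm instead of being passed through SoftMax. The function names `l2_attention` and `l2_attention_streaming`, the `eps` guard, and the `block` parameter are hypothetical and are not taken from the paper or from the FlashSign kernel. The sketch illustrates why such a normalization can keep the streaming property: the normalizer is a plain sum of squares over keys, so it can be accumulated one key block at a time without evaluating exponentials.

```python
import math


def _dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_attention(Q, K, V, eps=1e-12):
    """Reference version: build the full score row, then normalize it.

    Assumption (not from the paper): weights = scores / ||scores||_2.
    """
    out = []
    for q in Q:
        scores = [_dot(q, k) for k in K]
        norm = math.sqrt(sum(s * s for s in scores)) + eps
        out.append([
            sum(s * v[j] for s, v in zip(scores, V)) / norm
            for j in range(len(V[0]))
        ])
    return out


def l2_attention_streaming(Q, K, V, block=2, eps=1e-12):
    """Streaming version: visit keys/values block by block.

    Only a running unnormalized output and a running sum of squared
    scores are kept per query; the full score row is never stored.
    """
    out = []
    for q in Q:
        acc = [0.0] * len(V[0])   # running sum of score * value
        sq = 0.0                  # running sum of score ** 2
        for start in range(0, len(K), block):
            k_blk = K[start:start + block]
            v_blk = V[start:start + block]
            for k, v in zip(k_blk, v_blk):
                s = _dot(q, k)
                sq += s * s
                for j in range(len(acc)):
                    acc[j] += s * v[j]
        norm = math.sqrt(sq) + eps
        out.append([a / norm for a in acc])
    return out
```

Both functions should agree up to floating-point rounding for any block size. A production kernel would tile over queries as well and run on the GPU; this sketch only shows the accumulation pattern.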

Subjects:	Category Theory (math.CT); Machine Learning (cs.LG); Molecular Networks (q-bio.MN)
Cite as:	arXiv:2505.09326 [math.CT]
	(or arXiv:2505.09326v1 [math.CT] for this version)
	https://doi.org/10.48550/arXiv.2505.09326

Submission history

From: Gerard Glowacki
[v1] Wed, 14 May 2025 12:24:22 UTC (1,491 KB)
