Teaching Transformers Modular Arithmetic at Scale

Saxena, Eshika; Alfarano, Alberto; Wenger, Emily; Lauter, Kristin

arXiv:2410.03569 (cs)

[Submitted on 4 Oct 2024]

Abstract: Modular addition is, on its face, a simple operation: given $N$ elements in $\mathbb{Z}_q$, compute their sum modulo $q$. Yet scalable machine learning solutions to this problem remain elusive: prior work trains ML models that sum $N \le 6$ elements mod $q \le 1000$. Promising applications of ML models for cryptanalysis, which often involve modular arithmetic with large $N$ and $q$, motivate reconsideration of this problem. This work proposes three changes to the modular addition model training pipeline: more diverse training data, an angular embedding, and a custom loss function. With these changes, we demonstrate success for $N = 256, q = 3329$, a case that is interesting for cryptographic applications and a significant increase in $N$ and $q$ over prior work. These techniques also generalize to other modular arithmetic problems, motivating future work.
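
To make the task concrete, here is a minimal Python sketch of modular addition, together with one plausible reading of an "angular" representation of residues. The abstract does not define the embedding, so mapping $x \in \mathbb{Z}_q$ to the angle $2\pi x / q$ on the unit circle is an assumption made purely for illustration, and the helper names (`modular_sum`, `to_angle`, `from_point`, `angular_sum`) are hypothetical. No model is trained here; the sketch only shows why adding angles agrees with adding residues mod $q$.

```python
import math
import random


def modular_sum(xs, q):
    """Reference answer: sum the elements of xs modulo q."""
    return sum(xs) % q


def to_angle(x, q):
    """Map a residue x in Z_q to an angle in [0, 2*pi).

    Assumed encoding for illustration; the abstract gives no formula.
    """
    return 2 * math.pi * (x % q) / q


def from_point(c, s, q):
    """Decode a point (c, s) on the unit circle to the nearest residue."""
    angle = math.atan2(s, c) % (2 * math.pi)
    return round(angle * q / (2 * math.pi)) % q


def angular_sum(xs, q):
    """Add residues by adding their angles, then decode the result.

    Angles wrap around at 2*pi exactly as residues wrap around at q.
    """
    total = sum(to_angle(x, q) for x in xs)
    return from_point(math.cos(total), math.sin(total), q)


if __name__ == "__main__":
    # Problem size quoted in the abstract: N = 256 elements, q = 3329.
    q, n = 3329, 256
    rng = random.Random(0)
    xs = [rng.randrange(q) for _ in range(n)]
    print(modular_sum(xs, q), angular_sum(xs, q))
```

On the unit circle, residues that are close modulo $q$ (such as $q - 1$ and $0$) are also close geometrically, which is one reason a representation of this kind might suit modular problems; whether this matches the authors' design is not stated in the abstract.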

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2410.03569 [cs.LG]
	(or arXiv:2410.03569v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2410.03569

Submission history

From: Eshika Saxena
[v1] Fri, 4 Oct 2024 16:19:33 UTC (42,694 KB)
