On the equivalence of different adaptive batch size selection strategies for stochastic gradient descent methods

Espath, Luis; Krumscheid, Sebastian; Tempone, Raúl; Vilanova, Pedro

arXiv:2109.10933 (math)

[Submitted on 22 Sep 2021 (v1), last revised 4 Jul 2023 (this version, v2)]

Abstract: In this study, we demonstrate that the norm test and the inner product/orthogonality test presented in \cite{Bol18} are equivalent in terms of the convergence rates associated with Stochastic Gradient Descent (SGD) methods if $\epsilon^2=\theta^2+\nu^2$ with specific choices of $\theta$ and $\nu$. Here, $\epsilon$ controls the relative statistical error of the norm of the gradient, while $\theta$ and $\nu$ control the relative statistical error of the gradient in the direction of the gradient and in the direction orthogonal to the gradient, respectively. Furthermore, we demonstrate that the inner product/orthogonality test can be as inexpensive as the norm test in the best-case scenario, when $\theta$ and $\nu$ are optimally selected, but that it is never more computationally affordable than the norm test if $\epsilon^2=\theta^2+\nu^2$. Finally, we present two stochastic optimization problems to illustrate our results.
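The cost comparison in the abstract can be illustrated with a small numerical sketch. The code below is not taken from the paper. It assumes a common sample-based reading of the tests: the norm test bounds the total sample variance of the per-sample gradients by $\epsilon^2$ times the squared norm of the mean gradient, while the inner product and orthogonality tests bound the variance along the mean gradient (by $\theta^2$) and orthogonal to it (by $\nu^2$). The function name `required_batch_sizes`, the synthetic gradients, and the "balanced" choice of $\theta$ and $\nu$ are all hypothetical choices made for illustration.

```python
import math
import random


def required_batch_sizes(grads, eps, theta, nu):
    """Estimate real-valued batch sizes implied by each test (illustrative only).

    grads: list of per-sample gradient vectors (lists of floats).
    Returns (b_norm, b_ip_orth, v_par, v_orth).
    """
    n, d = len(grads), len(grads[0])
    # Sample-mean gradient and its squared norm.
    g = [sum(row[j] for row in grads) / n for j in range(d)]
    g_norm2 = sum(x * x for x in g)

    v_tot = 0.0  # total sample variance (trace of the sample covariance)
    v_par = 0.0  # sample variance along the mean-gradient direction
    for row in grads:
        diff = [row[j] - g[j] for j in range(d)]
        v_tot += sum(x * x for x in diff)
        proj = sum(diff[j] * g[j] for j in range(d))
        v_par += proj * proj / g_norm2
    v_tot /= n - 1
    v_par /= n - 1
    v_orth = max(v_tot - v_par, 0.0)  # variance orthogonal to the mean gradient

    # Assumed norm test: v_tot / b <= eps^2 * ||g||^2.
    b_norm = v_tot / (eps ** 2 * g_norm2)
    # Assumed inner product and orthogonality tests must both hold:
    # v_par / b <= theta^2 * ||g||^2 and v_orth / b <= nu^2 * ||g||^2.
    b_ip_orth = max(v_par / theta ** 2, v_orth / nu ** 2) / g_norm2
    return b_norm, b_ip_orth, v_par, v_orth


# Synthetic per-sample gradients (hypothetical data, mean close to the ones vector).
random.seed(0)
grads = [[1.0 + random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]

eps = 0.5
# An arbitrary split satisfying eps^2 = theta^2 + nu^2.
theta = nu = eps / math.sqrt(2.0)
b_norm, b_ipo, v_par, v_orth = required_batch_sizes(grads, eps, theta, nu)

# A "balanced" split under this simple model: both sub-tests become equally binding.
v_tot = v_par + v_orth
theta_opt = eps * math.sqrt(v_par / v_tot)
nu_opt = eps * math.sqrt(v_orth / v_tot)
b_norm_opt, b_ipo_opt, _, _ = required_batch_sizes(grads, eps, theta_opt, nu_opt)

print("arbitrary split:", b_norm, b_ipo)
print("balanced split: ", b_norm_opt, b_ipo_opt)
```

Under these assumptions, the batch size implied by the inner product/orthogonality pair is $\max(V_\parallel/\theta^2, V_\perp/\nu^2)/\|g\|^2$, which is at least $(V_\parallel+V_\perp)/(\epsilon^2\|g\|^2)$ whenever $\epsilon^2=\theta^2+\nu^2$, with equality only when the two ratios coincide. This is consistent with the abstract's statement, though the paper's own definitions and optimal choices of $\theta$ and $\nu$ may differ from this sketch.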

Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2109.10933 [math.OC] (or arXiv:2109.10933v2 [math.OC] for this version)
DOI: https://doi.org/10.48550/arXiv.2109.10933

Submission history

From: Luis Espath
[v1] Wed, 22 Sep 2021 18:01:15 UTC (159 KB)
[v2] Tue, 4 Jul 2023 10:20:34 UTC (159 KB)
