Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond

Yun, Chulhee; Rajput, Shashank; Sra, Suvrit

arXiv:2110.10342 (cs)

[Submitted on 20 Oct 2021 (v1), last revised 23 Mar 2022 (this version, v2)]

Abstract: In distributed learning, local SGD (also known as federated averaging) and its simple baseline minibatch SGD are widely studied optimization methods. Most existing analyses of these methods assume independent and unbiased gradient estimates obtained via with-replacement sampling. In contrast, we study shuffling-based variants: minibatch and local Random Reshuffling, which draw stochastic gradients without replacement and are thus closer to practice. For smooth functions satisfying the Polyak-Łojasiewicz condition, we obtain convergence bounds (in the large epoch regime) which show that these shuffling-based variants converge faster than their with-replacement counterparts. Moreover, we prove matching lower bounds showing that our convergence analysis is tight. Finally, we propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
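To make the distinction between the two shuffling-based variants concrete, the following is a minimal sketch on a toy one-dimensional problem. It is not the paper's pseudocode: the function names, the quadratic components f_{m,i}(x) = 0.5 * (x - targets[m, i])**2, and the parameters `lr`, `sync_every`, and `batch` are all assumptions made for illustration. Both functions draw one permutation per machine per epoch, so every local component is used exactly once per epoch (sampling without replacement).

```python
import numpy as np

def local_rr_epoch(x, targets, lr, sync_every, rng):
    """One epoch of a local-Random-Reshuffling-style update (illustrative).

    targets has shape (M, N): machine m holds N components whose
    gradients at x are x - targets[m, i].
    """
    M, N = targets.shape
    # Each machine shuffles its own components independently.
    perms = [rng.permutation(N) for _ in range(M)]
    local = np.full(M, x, dtype=float)  # every machine starts from x
    for step in range(N):
        for m in range(M):
            i = perms[m][step]
            local[m] -= lr * (local[m] - targets[m, i])  # local SGD step
        if (step + 1) % sync_every == 0:
            local[:] = local.mean()  # communication: average the iterates
    return float(local.mean())

def minibatch_rr_epoch(x, targets, lr, batch, rng):
    """One epoch of a minibatch-Random-Reshuffling-style update (illustrative)."""
    M, N = targets.shape
    perms = np.stack([rng.permutation(N) for _ in range(M)])
    for start in range(0, N, batch):
        idx = perms[:, start:start + batch]              # shape (M, <=batch)
        picked = np.take_along_axis(targets, idx, axis=1)
        # All gradients are evaluated at the same shared iterate x.
        x = x - lr * (x - picked).mean()
    return float(x)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    targets = rng.normal(size=(4, 8))  # hypothetical data: 4 machines, 8 components
    x_local, x_mb = 10.0, 10.0
    for _ in range(200):
        x_local = local_rr_epoch(x_local, targets, lr=0.01, sync_every=4, rng=rng)
        x_mb = minibatch_rr_epoch(x_mb, targets, lr=0.01, batch=4, rng=rng)
    print(x_local, x_mb, targets.mean())
```

In this sketch the only structural difference is where the gradients are evaluated: the local variant lets each machine move its own iterate between averaging rounds, while the minibatch variant evaluates every gradient at the shared iterate and takes a single averaged step. A with-replacement baseline could be obtained by replacing each permutation with independently drawn indices. The sketch does not implement synchronized shuffling, since the abstract does not specify that procedure.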

Comments:	ICLR 2022 camera-ready (selected for an oral presentation); 76 pages, 3 figures
Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as:	arXiv:2110.10342 [cs.LG]
	(or arXiv:2110.10342v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2110.10342

Submission history

From: Chulhee Yun
[v1] Wed, 20 Oct 2021 02:25:25 UTC (87 KB)
[v2] Wed, 23 Mar 2022 15:13:50 UTC (153 KB)
