Generalization Error of First-Order Methods for Statistical Learning with Generic Oracles

Scaman, Kevin; Even, Mathieu; Massoulié, Laurent

arXiv:2307.04679v2 (cs)

[Submitted on 10 Jul 2023 (v1), revised 11 Jul 2023 (this version, v2), latest version 1 Jul 2024 (v3)]

Abstract: In this paper, we provide a novel framework for the analysis of the generalization error of first-order optimization algorithms for statistical learning when the gradient can only be accessed through partial observations given by an oracle. Our analysis relies on the regularity of the gradient w.r.t. the data samples, and allows us to derive near-matching upper and lower bounds for the generalization error of multiple learning problems, including supervised learning, transfer learning, robust learning, distributed learning and communication-efficient learning using gradient quantization. These results hold for smooth and strongly convex optimization problems, as well as smooth non-convex optimization problems satisfying a Polyak-Łojasiewicz assumption. In particular, our upper and lower bounds depend on a novel quantity that extends the notion of conditional standard deviation, and is a measure of the extent to which the gradient can be approximated by having access to the oracle. As a consequence, our analysis provides a precise meaning to the intuition that optimization of the statistical learning objective is as hard as the estimation of its gradient. Finally, we show that, in the case of standard supervised learning, mini-batch gradient descent with increasing batch sizes and a warm start can reach a generalization error that is optimal up to a multiplicative factor, thus motivating the use of this optimization scheme in practical applications.
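The optimization scheme named at the end of the abstract, mini-batch gradient descent with increasing batch sizes, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the least-squares loss, the constant step size, the geometric (doubling) batch schedule, the synthetic data, and the hypothetical function name `minibatch_gd_increasing`. The warm start is modeled simply as the caller-supplied initial iterate `w0`.

```python
import numpy as np

def minibatch_gd_increasing(X, y, w0, step=0.1, b0=4, growth=2.0):
    """Single pass over (X, y) using fresh samples at every step,
    with batch sizes growing geometrically (illustrative schedule)."""
    w = np.array(w0, dtype=float)   # warm start: iterate supplied by the caller
    n = X.shape[0]
    start, batch = 0, b0
    batch_sizes = []
    while start < n:
        end = min(start + batch, n)             # last batch may be truncated
        Xb, yb = X[start:end], y[start:end]
        # Gradient of the mean squared error 0.5 * ||Xb w - yb||^2 / batch
        grad = Xb.T @ (Xb @ w - yb) / (end - start)
        w -= step * grad
        batch_sizes.append(end - start)
        start = end
        batch = int(np.ceil(batch * growth))    # increase the batch size
    return w, batch_sizes

# Synthetic linear-regression data (assumption, for demonstration only).
rng = np.random.default_rng(0)
d, n = 5, 4000
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)

w_hat, sizes = minibatch_gd_increasing(X, y, w0=np.zeros(d))
err = np.linalg.norm(w_hat - w_true)
```

Each sample is used exactly once, so later gradient estimates are averaged over more samples and are less noisy than early ones. The sketch only shows the mechanics of the schedule; it makes no claim about the optimality result stated in the abstract.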

Comments:	18 pages, 0 figures
Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:2307.04679 [cs.LG]
	(or arXiv:2307.04679v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2307.04679

Submission history

From: Kevin Scaman
[v1] Mon, 10 Jul 2023 16:29:05 UTC (25 KB)
[v2] Tue, 11 Jul 2023 10:12:26 UTC (25 KB)
[v3] Mon, 1 Jul 2024 11:44:15 UTC (135 KB)
