Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability

Wu, Jingfeng; Braverman, Vladimir; Lee, Jason D.

arXiv:2305.11788 (cs)

[Submitted on 19 May 2023 (v1), last revised 15 Oct 2023 (this version, v2)]

Abstract: Recent research has observed that in machine learning optimization, gradient descent (GD) often operates at the edge of stability (EoS) [Cohen et al., 2021], where the stepsizes are set to be large, resulting in non-monotonic losses induced by the GD iterates. This paper studies the convergence and implicit bias of constant-stepsize GD for logistic regression on linearly separable data in the EoS regime. Despite the presence of local oscillations, we prove that the logistic loss can be minimized by GD with \emph{any} constant stepsize over a long time scale. Furthermore, we prove that with \emph{any} constant stepsize, the GD iterates tend to infinity when projected to a max-margin direction (the hard-margin SVM direction) and converge to a fixed vector that minimizes a strongly convex potential when projected to the orthogonal complement of the max-margin direction. In contrast, we also show that in the EoS regime, GD iterates may diverge catastrophically under the exponential loss, highlighting the superiority of the logistic loss. These theoretical findings are in line with numerical simulations and complement existing theories on the convergence and implicit bias of GD for logistic regression, which are only applicable when the stepsizes are sufficiently small.
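
The behavior described in the abstract can be illustrated with a small numerical sketch. The two-point dataset, the stepsize of 10, the step count, and all function names below are illustrative assumptions and are not taken from the paper. Labels are assumed to be absorbed into the features, so the loss is the mean of log(1 + exp(-x_i . w)). For this toy dataset the max-margin direction is (1, 0). The short run is only meant to show an early non-monotonic phase followed by a decrease of the loss; it does not reach the asymptotic limit of the component orthogonal to the max-margin direction.

```python
import numpy as np

def logistic_loss(w, X):
    # Mean of log(1 + exp(-x_i . w)); logaddexp avoids overflow for large margins.
    return np.mean(np.logaddexp(0.0, -X @ w))

def logistic_grad(w, X):
    margins = X @ w
    # sigmoid(-m) = 1 / (1 + exp(m)), computed stably as exp(-log(1 + exp(m))).
    weights = np.exp(-np.logaddexp(0.0, margins))
    return -(weights[:, None] * X).mean(axis=0)

def run_gd(X, eta, steps):
    # Constant-stepsize gradient descent started from the origin.
    w = np.zeros(X.shape[1])
    losses = [logistic_loss(w, X)]
    for _ in range(steps):
        w = w - eta * logistic_grad(w, X)
        losses.append(logistic_loss(w, X))
    return w, np.array(losses)

# Hypothetical linearly separable data (labels absorbed, so every row should
# end up with a positive margin). Both rows are support vectors and the
# max-margin direction is (1, 0).
X = np.array([[1.0, 4.0],
              [1.0, -2.0]])
eta = 10.0  # assumed "large" stepsize, chosen so the early losses increase

w, losses = run_gd(X, eta, steps=1000)

print("initial loss:", losses[0])
print("largest loss seen:", losses.max())
print("final loss:", losses[-1])
print("projection onto max-margin direction (1, 0):", w[0])
print("component in the orthogonal direction (0, 1):", w[1])
```

With these assumed settings the loss rises over the first few iterations before it falls below its starting value, and the projection of the iterate onto (1, 0) ends up positive. A smaller `eta` would be expected to remove the early increase at the cost of slower progress.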

Comments:	NeurIPS 2023 camera ready version
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2305.11788 [cs.LG]
	(or arXiv:2305.11788v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2305.11788

Submission history

From: Jingfeng Wu
[v1] Fri, 19 May 2023 16:24:47 UTC (63 KB)
[v2] Sun, 15 Oct 2023 17:53:26 UTC (91 KB)
