On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization

Chen, Xiangyi; Liu, Sijia; Sun, Ruoyu; Hong, Mingyi

arXiv:1808.02941 (cs)

[Submitted on 8 Aug 2018 (v1), last revised 10 Mar 2019 (this version, v2)]

Abstract: This paper studies a class of adaptive gradient-based momentum algorithms that update the search direction and the learning rate simultaneously using past gradients. This class, which we refer to as "Adam-type", includes popular algorithms such as Adam, AMSGrad, and AdaGrad. Despite their popularity in training deep neural networks, the convergence of these algorithms on nonconvex problems remains an open question. This paper provides a set of mild sufficient conditions that guarantee the convergence of Adam-type methods. We prove that, under these conditions, the methods achieve a convergence rate of order $O(\log{T}/\sqrt{T})$ for nonconvex stochastic optimization. We show that the conditions are essential in the sense that violating them may cause the algorithm to diverge. Moreover, we propose and analyze a class of (deterministic) incremental adaptive gradient algorithms, which have the same $O(\log{T}/\sqrt{T})$ convergence rate. Our study could also be extended to a broader class of adaptive gradient methods in machine learning and optimization.
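The abstract does not spell out the update rule, so the following is only a minimal sketch of what "updating the search direction and the learning rate simultaneously using past gradients" can look like. It assumes the commonly used forms of Adam, AMSGrad, and AdaGrad: a momentum term m gives the search direction, and a second-moment estimate v built from past squared gradients rescales the step per coordinate. The function names, the toy quadratic objective, the diminishing step size alpha / sqrt(t), and the omission of Adam's bias correction are all illustrative assumptions, not details taken from the paper.

```python
import math


def adam_type_minimize(grad, x0, steps=2000, alpha=0.1, beta1=0.9,
                       beta2=0.999, eps=1e-8, variant="amsgrad"):
    """Run a generic Adam-type loop (hypothetical helper, for illustration).

    grad    -- function mapping a list of floats to its gradient (a list)
    variant -- "adam", "amsgrad", or "adagrad"; selects how the
               second-moment estimate is built from past gradients
    """
    if variant not in ("adam", "amsgrad", "adagrad"):
        raise ValueError("unknown variant: " + str(variant))
    x = list(x0)
    n = len(x)
    m = [0.0] * n         # momentum: the search direction
    v = [0.0] * n         # exponential average of squared gradients
    v_hat = [0.0] * n     # running maximum of v (used by AMSGrad only)
    g_sq_sum = [0.0] * n  # running sum of squared gradients (AdaGrad only)
    for t in range(1, steps + 1):
        g = grad(x)
        alpha_t = alpha / math.sqrt(t)  # assumed diminishing step size
        for i in range(n):
            if variant == "adagrad":
                # No momentum; scale by the average of all past g^2.
                m[i] = g[i]
                g_sq_sum[i] += g[i] * g[i]
                scale = g_sq_sum[i] / t
            else:
                # Bias correction is deliberately omitted in this sketch.
                m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
                v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] * g[i]
                if variant == "amsgrad":
                    # Keep the scale non-decreasing across iterations.
                    v_hat[i] = max(v_hat[i], v[i])
                    scale = v_hat[i]
                else:
                    scale = v[i]
            # Direction (m) and per-coordinate learning rate (via scale)
            # are both derived from past gradients.
            x[i] -= alpha_t * m[i] / (math.sqrt(scale) + eps)
    return x


def quadratic_grad(x):
    """Gradient of the toy objective f(x) = 0.5 * sum(x_i ** 2)."""
    return list(x)
```

Calling `adam_type_minimize(quadratic_grad, [3.0, -2.0], variant="amsgrad")` drives both coordinates toward zero on this toy problem. The three variants differ only in how the scaling term is formed, which is the sense in which a single template can cover all of them; the convex toy objective is used purely to keep the example short and says nothing about the nonconvex guarantees discussed in the abstract.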

Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as: arXiv:1808.02941 [cs.LG] (or arXiv:1808.02941v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1808.02941

Submission history

From: Xiangyi Chen
[v1] Wed, 8 Aug 2018 21:14:07 UTC (1,620 KB)
[v2] Sun, 10 Mar 2019 00:48:35 UTC (1,932 KB)
