A Critical View of Global Optimality in Deep Learning

Yun, Chulhee; Sra, Suvrit; Jadbabaie, Ali

arXiv:1802.03487v1 (cs)

[Submitted on 10 Feb 2018 (this version), latest version 28 May 2019 (v4)]

Abstract: We investigate the loss surface of deep linear and nonlinear neural networks. We show that for deep linear networks with differentiable losses, critical points after the multilinear parameterization inherit the structure of critical points of the underlying loss with linear parameterization. As corollaries we obtain "local minima are global" results that subsume most previous results, while showing how to distinguish global minima from saddle points. For nonlinear neural networks, we prove two theorems showing that even for networks with one hidden layer, there can be spurious local minima. Indeed, for piecewise linear nonnegative homogeneous activations (e.g., ReLU), we prove that for almost all practical datasets there exist infinitely many local minima that are not global. We conclude by constructing a counterexample involving other activation functions (e.g., sigmoid, tanh, arctan, etc.), for which there exists a local minimum strictly inferior to the global minimum.
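Two terms in the abstract can be made concrete with a small numerical check. The sketch below is illustrative only and is not taken from the paper: the layer sizes, the random data, the squared loss, and all function names are assumptions chosen for the example. It checks that a deep linear network computes the same map as the single product matrix (the "multilinear parameterization" of a linear model), so both parameterizations give the same loss value, and that ReLU is nonnegative homogeneous, meaning relu(a*z) = a*relu(z) for every a >= 0.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def random_matrix(rows, cols, rng):
    """Matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def squared_loss(pred, target):
    """Half the sum of squared entrywise differences (an assumed loss)."""
    return 0.5 * sum((p - t) ** 2
                     for pred_row, target_row in zip(pred, target)
                     for p, t in zip(pred_row, target_row))

def relu(z):
    return max(z, 0.0)

rng = random.Random(0)

# Hypothetical sizes: input dim 3, hidden dims 4 and 5, output dim 2, 6 samples.
W1 = random_matrix(4, 3, rng)
W2 = random_matrix(5, 4, rng)
W3 = random_matrix(2, 5, rng)
X = random_matrix(3, 6, rng)   # one column per sample
Y = random_matrix(2, 6, rng)

# Deep linear network: apply the layers one after another.
deep_out = matmul(W3, matmul(W2, matmul(W1, X)))

# Linear parameterization: collapse the layers into one matrix R = W3 W2 W1.
R = matmul(W3, matmul(W2, W1))
linear_out = matmul(R, X)

# Both parameterizations define the same function, hence the same loss value.
loss_gap = abs(squared_loss(deep_out, Y) - squared_loss(linear_out, Y))

# Nonnegative homogeneity of ReLU on a few sample points.
homogeneity_gap = max(abs(relu(a * z) - a * relu(z))
                      for a in (0.0, 0.5, 3.0)
                      for z in (-2.0, -0.1, 0.0, 1.5))

print("loss gap:", loss_gap)
print("homogeneity gap:", homogeneity_gap)
```

The check only confirms that the two parameterizations agree in value. The abstract's claim concerns how critical points of the loss in (W1, W2, W3) relate to critical points of the loss in R, which this sketch does not attempt to verify.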

Comments:	35 pages
Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as:	arXiv:1802.03487 [cs.LG]
	(or arXiv:1802.03487v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1802.03487

Submission history

From: Chulhee Yun
[v1] Sat, 10 Feb 2018 00:49:17 UTC (38 KB)
[v2] Tue, 4 Sep 2018 20:58:56 UTC (39 KB)
[v3] Fri, 28 Sep 2018 04:27:13 UTC (39 KB)
[v4] Tue, 28 May 2019 15:25:47 UTC (72 KB)
