Experiments with Rich Regime Training for Deep Learning

Li, Xinyan; Banerjee, Arindam

arXiv:2102.13522 (cs)

[Submitted on 26 Feb 2021]

Abstract: In spite of advances in understanding lazy training, recent work attributes the practical success of deep learning to the rich regime with complex inductive bias. In this paper, we study rich regime training empirically with benchmark datasets, and find that while most parameters are lazy, there is always a small number of active parameters which change quite a bit during training. We show that re-initializing (resetting to their initial random values) the active parameters leads to worse generalization. Further, we show that most of the active parameters are in the bottom layers, close to the input, especially as the networks become wider. Based on such observations, we study static Layer-Wise Sparse (LWS) SGD, which only updates some subsets of layers. We find that only updating the top and bottom layers yields good generalization and, as expected, only updating the top layers yields a fast algorithm. Inspired by this, we investigate probabilistic LWS-SGD, which mostly updates the top layers and occasionally updates the full network. We show that probabilistic LWS-SGD matches the generalization performance of vanilla SGD and that its back-propagation can be 2-5 times more efficient.
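The probabilistic LWS-SGD idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy chain of scalar tanh layers, the regression target, and the names `forward`, `lws_sgd_step`, `train`, `num_top`, and `p_full` are all assumptions made for illustration. The abstract does not say how the top layers are chosen or how often the full network is updated, so both are plain parameters here. The sketch only shows why skipping the bottom layers shortens the backward pass: back-propagation stops at the lowest layer being updated.

```python
import math
import random


def forward(weights, x):
    """Return activations [h_0, ..., h_L] for a chain of scalar layers."""
    acts = [x]
    for i, w in enumerate(weights):
        pre = w * acts[-1]
        is_top = i == len(weights) - 1
        # Hidden layers use tanh; the top (output) layer is linear.
        acts.append(pre if is_top else math.tanh(pre))
    return acts


def lws_sgd_step(weights, x, y, update_layers, lr=0.1):
    """One SGD step on 0.5 * (out - y)**2, updating only `update_layers`.

    The backward pass runs from the top layer down and stops at the
    lowest layer that is updated. Returns the number of layers visited.
    """
    acts = forward(weights, x)
    lowest = min(update_layers)
    grad = acts[-1] - y  # dLoss / d(output)
    visited = 0
    for i in range(len(weights) - 1, lowest - 1, -1):
        if i < len(weights) - 1:
            grad *= 1.0 - acts[i + 1] ** 2  # derivative of tanh
        grad_w = grad * acts[i]
        grad_in = grad * weights[i]  # computed with the pre-update weight
        if i in update_layers:
            weights[i] -= lr * grad_w
        grad = grad_in
        visited += 1
    return visited


def train(num_layers=6, num_top=2, p_full=0.2, steps=2000, seed=0):
    """Probabilistic schedule (assumed): full update with probability
    p_full, otherwise update only the top `num_top` layers."""
    rng = random.Random(seed)
    init = [rng.uniform(0.5, 1.5) for _ in range(num_layers)]
    weights = list(init)
    all_layers = set(range(num_layers))
    top_layers = set(range(num_layers - num_top, num_layers))
    visited = 0
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        y = 0.5 * x  # toy regression target (assumption)
        layers = all_layers if rng.random() < p_full else top_layers
        visited += lws_sgd_step(weights, x, y, layers)
    return init, weights, visited


if __name__ == "__main__":
    steps, num_layers = 2000, 6
    _, _, visited = train(num_layers=num_layers, steps=steps)
    print("backward layer visits vs. full SGD:", visited / (steps * num_layers))
```

In this toy setting the printed ratio is the fraction of backward-pass layer visits relative to always updating every layer. It is a rough proxy for back-propagation cost and is not comparable to the timings reported in the paper.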

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:2102.13522 [cs.LG] (or arXiv:2102.13522v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2102.13522

Submission history

From: Xinyan Li
[v1] Fri, 26 Feb 2021 14:49:28 UTC (9,699 KB)
