Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method

Sun, Xu; Ren, Xuancheng; Ma, Shuming; Wei, Bingzhen; Li, Wei; Xu, Jingjing; Wang, Houfeng; Zhang, Yi

doi:10.1109/TKDE.2018.2883613

arXiv:1711.06528 (cs)

[Submitted on 17 Nov 2017 (v1), last revised 12 Mar 2019 (this version, v2)]

Authors:Xu Sun, Xuancheng Ren, Shuming Ma, Bingzhen Wei, Wei Li, Jingjing Xu, Houfeng Wang, Yi Zhang

Abstract: We propose a simple yet effective technique to simplify both the training and the resulting model of neural networks. In back propagation, only a small subset of the full gradient is computed to update the model parameters. The gradient vectors are sparsified so that only the top-k elements (in terms of magnitude) are kept. As a result, only k rows or columns (depending on the layout) of the weight matrix are modified, leading to a linear reduction in the computational cost. Based on the sparsified gradients, we further simplify the model by eliminating the rows or columns that are seldom updated, which reduces the computational cost in both training and decoding, and potentially accelerates decoding in real-world applications. Surprisingly, experimental results demonstrate that most of the time we only need to update fewer than 5% of the weights at each back propagation pass. More interestingly, the accuracy of the resulting models is actually improved rather than degraded, and a detailed analysis is given. The model simplification results show that we can adaptively simplify the model, which can often be reduced by around 9x, without any loss in accuracy or even with improved accuracy. The code, including the extension, is available at this https URL
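
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' released code: it assumes a single linear layer y = W x stored as one row per output unit, and the function names (`top_k_indices`, `sparse_linear_backward`, `rows_to_keep`) and the `min_rate` pruning threshold are hypothetical, chosen only for illustration. The abstract does not state the exact pruning criterion, so the rule below is an assumption.

```python
def top_k_indices(grad, k):
    """Indices of the k largest-magnitude entries of grad, in ascending order."""
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    return sorted(order[:k])


def sparse_linear_backward(W, x, grad_y, k):
    """Backward pass of y = W x that uses only the top-k entries of grad_y.

    W is a list of n_out rows of length n_in, x has length n_in,
    grad_y has length n_out. Returns (grad_W, grad_x, rows), where rows
    lists the only rows of grad_W that can be nonzero.
    """
    rows = top_k_indices(grad_y, k)
    n_in = len(x)
    grad_W = [[0.0] * n_in for _ in W]
    grad_x = [0.0] * n_in
    # Only k of the n_out rows are visited, so the work scales with k.
    for i in rows:
        for j in range(n_in):
            grad_W[i][j] = grad_y[i] * x[j]
            grad_x[j] += W[i][j] * grad_y[i]
    return grad_W, grad_x, rows


def rows_to_keep(update_counts, n_steps, min_rate):
    """Rows selected in at least min_rate of the steps (assumed criterion)."""
    return [i for i, c in enumerate(update_counts) if c / n_steps >= min_rate]


if __name__ == "__main__":
    W = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
    x = [1.0, 2.0]
    counts = [0] * len(W)
    steps = [[0.1, -0.9, 0.3], [0.2, 0.8, -0.1], [0.05, 0.1, -0.7]]
    for grad_y in steps:
        _, _, rows = sparse_linear_backward(W, x, grad_y, k=1)
        for i in rows:
            counts[i] += 1
    print("update counts per row:", counts)
    print("rows kept:", rows_to_keep(counts, len(steps), min_rate=0.5))
```

In this sketch the backward loop touches k of the n_out rows, which is the sense in which the cost reduction is linear in k. Rows that rarely appear in the top-k set accumulate low counts and become candidates for removal.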

Comments:	14 pages, 4 figures, 13 tables, accepted for publication in IEEE TKDE; this article supersedes arXiv:1706.06197
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1711.06528 [cs.LG]
	(or arXiv:1711.06528v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1711.06528
Related DOI:	https://doi.org/10.1109/TKDE.2018.2883613

Submission history

From: Xu Sun
[v1] Fri, 17 Nov 2017 13:36:51 UTC (349 KB)
[v2] Tue, 12 Mar 2019 01:22:44 UTC (498 KB)
