Ensemble-Compression: A New Method for Parallel Training of Deep Neural Networks

Sun, Shizhao; Chen, Wei; Liu, Tie-Yan

arXiv:1606.00575v1 (cs)

[Submitted on 2 Jun 2016 (this version), latest version 18 Jul 2017 (v2)]

Abstract: In recent years, parallel implementations have been used to speed up the training of deep neural networks (DNNs). Typically, the parameters of the local models are periodically communicated and averaged to obtain a global model until training converges (denoted MA-DNN). However, since a DNN is a highly non-convex model, the global model obtained by averaging parameters is not guaranteed to improve on the local models and might even perform worse than their average, which slows convergence and degrades final performance. To tackle this problem, we propose a new parallel training method called Ensemble-Compression (denoted EC-DNN). Specifically, we propose to aggregate the local models by ensemble, i.e., the outputs of the local models are averaged instead of the parameters. Because the widely used loss functions are convex with respect to the output of the model, the performance of the global model obtained in this way is guaranteed to be at least as good as the average performance of the local models. However, the size of the global model increases after each ensemble and may explode after multiple rounds of ensembling. Thus, we conduct model compression after each ensemble to ensure that the global model is the same size as the local models. We conducted experiments on a benchmark dataset. The experimental results demonstrate that our proposed EC-DNN can stably achieve better performance than MA-DNN.
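
The guarantee claimed for output averaging follows from Jensen's inequality: for a loss that is convex in the model output, the loss of the averaged output is at most the average of the individual losses. The sketch below illustrates this for cross-entropy on predicted class probabilities. It is not the authors' implementation or experimental setup: random logits stand in for the outputs of trained local models, and the function names (softmax, cross_entropy, ensemble_output, compare_losses) and all parameter values are assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw scores to class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross-entropy loss for one sample; convex in the predicted probabilities."""
    return -math.log(probs[label])


def ensemble_output(local_probs):
    """Average the outputs (class probabilities) of the local models."""
    k = len(local_probs)
    num_classes = len(local_probs[0])
    return [sum(p[c] for p in local_probs) / k for c in range(num_classes)]


def compare_losses(num_models=4, num_classes=10, num_samples=200, seed=0):
    """Return (mean ensemble loss, mean of the local models' average loss).

    Assumption: random logits are a stand-in for real local model outputs.
    """
    rng = random.Random(seed)
    ensemble_total = 0.0
    local_avg_total = 0.0
    for _ in range(num_samples):
        label = rng.randrange(num_classes)
        local_probs = [
            softmax([rng.gauss(0.0, 1.0) for _ in range(num_classes)])
            for _ in range(num_models)
        ]
        local_losses = [cross_entropy(p, label) for p in local_probs]
        ensemble_total += cross_entropy(ensemble_output(local_probs), label)
        local_avg_total += sum(local_losses) / num_models
    return ensemble_total / num_samples, local_avg_total / num_samples


if __name__ == "__main__":
    ensemble_loss, local_avg_loss = compare_losses()
    print(f"ensemble loss:      {ensemble_loss:.4f}")
    print(f"average local loss: {local_avg_loss:.4f}")
```

The inequality holds per sample, so it also holds for the mean over any dataset. No such bound applies to parameter averaging, because the loss is not convex in the parameters. The sketch covers only the ensemble step; the compression step described in the abstract is not modeled here.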

Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:1606.00575 [cs.DC]
	(or arXiv:1606.00575v1 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.1606.00575

Submission history

From: Shizhao Sun
[v1] Thu, 2 Jun 2016 08:10:10 UTC (59 KB)
[v2] Tue, 18 Jul 2017 08:50:05 UTC (100 KB)
