$L_0$-ARM: Network Sparsification via Stochastic Binary Optimization

Li, Yang; Ji, Shihao

doi:10.1007/978-3-030-46147-8_26

arXiv:1904.04432 (cs)

[Submitted on 9 Apr 2019 (v1), last revised 11 Sep 2019 (this version, v3)]

Abstract: We consider network sparsification as an $L_0$-norm regularized binary optimization problem, in which each unit of a neural network (e.g., a weight, neuron, or channel) is equipped with a stochastic binary gate whose parameters are optimized jointly with the original network parameters. We investigate Augment-Reinforce-Merge (ARM), a recently proposed unbiased gradient estimator, for this binary optimization problem. Compared to the hard concrete gradient estimator of Louizos et al., ARM prunes network architectures more effectively while retaining almost the same accuracies as the baseline methods. Like the hard concrete estimator, ARM enables conditional computation during model training, but more effectively because its gates are exactly binary. Thanks to the flexibility of ARM, many smooth or non-smooth parametric functions, such as the scaled sigmoid or the hard sigmoid, can be used to parameterize this binary optimization problem while retaining the unbiasedness of the ARM estimator; the hard concrete estimator, by contrast, has to rely on the hard sigmoid function to achieve conditional computation and thus accelerated training. Extensive experiments on multiple public datasets demonstrate state-of-the-art pruning rates with almost the same accuracies as the baseline methods. The resulting algorithm, $L_0$-ARM, sparsifies the Wide-ResNet models on CIFAR-10 and CIFAR-100, whereas the hard concrete estimator cannot. The code is publicly available at this https URL.
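
To make the estimator mentioned in the abstract concrete, here is a minimal sketch of the general ARM identity for independent Bernoulli gates, z_i ~ Bernoulli(sigmoid(phi_i)). It is not the authors' released code, and it assumes a plain sigmoid parameterization rather than the scaled or hard sigmoid named above. The function names, `toy_objective`, its weights, and the penalty weight `lam` are all hypothetical stand-ins; a real use would replace `toy_objective` with a network loss evaluated under the sampled gates. The exact gradient is obtained by enumerating every gate configuration, which is only feasible at toy sizes.

```python
import itertools
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_objective(z, weights=(1.0, -1.0, 0.5), target=1.0, lam=0.1):
    """Hypothetical stand-in: squared fitting error plus an L0 penalty on the gates."""
    fit = (sum(w * g for w, g in zip(weights, z)) - target) ** 2
    return fit + lam * sum(z)  # sum(z) counts the open gates, i.e. the L0 norm


def arm_gradient(f, phi, n_samples, rng):
    """Monte Carlo ARM estimate of d/dphi E[f(z)], z_i ~ Bernoulli(sigmoid(phi_i)).

    Uses E_u[(f(1[u > sigmoid(-phi)]) - f(1[u < sigmoid(phi)])) * (u - 1/2)]
    with u ~ Uniform(0, 1) drawn independently per gate.
    """
    p_pos = [sigmoid(p) for p in phi]
    p_neg = [sigmoid(-p) for p in phi]
    grad = [0.0] * len(phi)
    for _ in range(n_samples):
        u = [rng.random() for _ in phi]
        z_anti = [1.0 if ui > q else 0.0 for ui, q in zip(u, p_neg)]
        z_main = [1.0 if ui < q else 0.0 for ui, q in zip(u, p_pos)]
        diff = f(z_anti) - f(z_main)  # two evaluations of f per sample
        for i, ui in enumerate(u):
            grad[i] += diff * (ui - 0.5)
    return [g / n_samples for g in grad]


def exact_gradient(f, phi):
    """Exact gradient by enumerating all 2^d gate configurations (toy sizes only)."""
    probs = [sigmoid(p) for p in phi]
    grad = [0.0] * len(phi)
    for z in itertools.product((0.0, 1.0), repeat=len(phi)):
        pz = math.prod(p if zi == 1.0 else 1.0 - p for zi, p in zip(z, probs))
        fz = f(z)
        for i, (zi, p) in enumerate(zip(z, probs)):
            # d log P(z) / d phi_i = z_i - sigmoid(phi_i)
            grad[i] += pz * fz * (zi - p)
    return grad


if __name__ == "__main__":
    rng = random.Random(0)
    phi = [0.5, -1.0, 2.0]
    print("ARM estimate:", arm_gradient(toy_objective, phi, 100_000, rng))
    print("Exact       :", exact_gradient(toy_objective, phi))
```

Both gate samples in `arm_gradient` are exactly binary, so in a network setting any unit whose gate is 0 could be skipped in the forward pass. This is the sense in which a binary gate permits conditional computation.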

Comments:	Published as a conference paper at ECML 2019
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1904.04432 [cs.LG]
	(or arXiv:1904.04432v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1904.04432
Related DOI:	https://doi.org/10.1007/978-3-030-46147-8_26

Submission history

From: Yang Li
[v1] Tue, 9 Apr 2019 02:43:31 UTC (330 KB)
[v2] Sat, 7 Sep 2019 18:08:27 UTC (399 KB)
[v3] Wed, 11 Sep 2019 13:38:21 UTC (399 KB)
