Minimizing Area and Energy of Deep Learning Hardware Design Using Collective Low Precision and Structured Compression

Yin, Shihui; Srivastava, Gaurav; Venkataramanaiah, Shreyas K.; Chakrabarti, Chaitali; Berisha, Visar; Seo, Jae-sun

arXiv:1804.07370 (cs)

[Submitted on 19 Apr 2018]

Abstract: Deep learning algorithms have shown tremendous success in many recognition tasks; however, these algorithms typically include a deep neural network (DNN) structure and a large number of parameters, which makes it challenging to implement them on power/area-constrained embedded platforms. To reduce the network size, several studies have investigated compression by introducing element-wise or row-/column-/block-wise sparsity via pruning and regularization. In addition, many recent works have focused on reducing the precision of activations and weights, with some going down to a single bit. However, combining various sparsity structures with binarized or very-low-precision (2-3 bit) neural networks has not been comprehensively explored. In this work, we present design techniques for minimum-area/-energy DNN hardware with minimal degradation in accuracy. During training, both binarization/low-precision and structured sparsity are applied as constraints to find the smallest memory footprint for a given deep learning algorithm. The DNN model for the CIFAR-10 dataset with a weight memory reduction of 50X exhibits accuracy comparable to that of the floating-point counterpart. Area, performance and energy results of DNN hardware in 40nm CMOS are reported for the MNIST dataset. The optimized DNN that combines 8X structured compression and 3-bit weight precision showed 98.4% accuracy at 20nJ per classification.
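To illustrate how structured compression and low weight precision compound, the following is a minimal back-of-the-envelope sketch, not the authors' accounting. It assumes a dense baseline of 32 bits per weight and ignores any storage overhead for sparsity indices; the function name `memory_reduction` and its parameters are hypothetical.

```python
def memory_reduction(bits_per_weight, compression, baseline_bits=32):
    """Weight-memory reduction factor relative to a dense baseline.

    Assumptions (not taken from the paper): the baseline stores every
    weight in `baseline_bits` bits, and structured compression keeps
    1/compression of the weights with no index overhead.
    """
    if bits_per_weight <= 0 or compression <= 0:
        raise ValueError("bits_per_weight and compression must be positive")
    # Fewer weights (compression) times fewer bits per weight (precision).
    return compression * baseline_bits / bits_per_weight


# 8X structured compression combined with 3-bit weights,
# compared against an assumed 32-bit floating-point baseline.
reduction = memory_reduction(bits_per_weight=3, compression=8)
print(f"{reduction:.1f}x")  # prints "85.3x"
```

Under these assumptions the two factors multiply, which is why applying both constraints together during training can shrink the weight memory far more than either technique alone. The figures reported in the abstract come from the paper's own measurements and need not match this simplified estimate.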

Comments:	2017 Asilomar Conference on Signals, Systems and Computers
Subjects:	Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:1804.07370 [cs.NE]
	(or arXiv:1804.07370v1 [cs.NE] for this version)
	https://doi.org/10.48550/arXiv.1804.07370

Submission history

From: Jae-sun Seo
[v1] Thu, 19 Apr 2018 20:32:04 UTC (1,422 KB)
