Deep Neural Network Capacity

Wang, Aosen; Zhou, Hua; Xu, Wenyao; Chen, Xin

arXiv:1708.05029 (cs)

This paper has been withdrawn by Xin Chen

[Submitted on 16 Aug 2017 (v1), last revised 18 Feb 2018 (this version, v3)]

Abstract: In recent years, deep neural networks have shown strong discriminative power in many computer vision applications. However, the capacity of a deep neural network architecture remains poorly understood. Intuitively, a network with larger capacity can always store more information and thereby improve the discrimination ability of the model. The number of learnable parameters, however, is not a reliable estimate of that capacity: because of overfitting, directly increasing the number of hidden nodes or hidden layers has been shown not to necessarily increase the network's discrimination ability.
In this paper, we propose a novel measure, named "total valid bits", to evaluate the capacity of deep neural networks, with the aim of quantitatively understanding deep learning and the reasons behind its strong performance. Specifically, our scheme for obtaining the total valid bits combines techniques in both the training phase and the inference phase. During training, we design decimal weight regularization and 8-bit forward quantization to obtain integer-oriented network representations. In the inference phase, we develop an adaptive-bitwidth, non-uniform quantization strategy to find the network capacity, that is, the total valid bits. By allowing a bitwidth of zero, our adaptive-bitwidth quantization performs model reduction and valid-bit search simultaneously. In extensive experiments, we first demonstrate that total valid bits is a good indicator of neural network capacity. We then analyze how the network architecture and advanced training techniques, such as dropout and batch normalization, affect network capacity.
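
The abstract does not spell out how the bitwidths are chosen or how the bit count is aggregated, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes uniform min-max quantization (the paper describes a non-uniform scheme), a relative-error tolerance `tol` as the criterion for accepting a bitwidth, and a total computed as bitwidth times number of weights per group. The function names `quantize_uniform`, `min_bitwidth`, and `total_valid_bits` are hypothetical.

```python
import math


def quantize_uniform(weights, bits):
    """Uniform min-max quantization to 2**bits levels (assumed scheme).

    A bitwidth of zero maps every weight to 0.0, i.e. the group is pruned.
    """
    if bits == 0:
        return [0.0 for _ in weights]
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # A constant group is represented exactly by a single level.
        return list(weights)
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


def min_bitwidth(weights, tol=0.05, max_bits=8):
    """Smallest bitwidth in 0..max_bits whose relative L2 error is <= tol.

    The tolerance criterion is an assumption made for this sketch.
    """
    norm = math.sqrt(sum(w * w for w in weights))
    for bits in range(max_bits + 1):
        quantized = quantize_uniform(weights, bits)
        err = math.sqrt(sum((w - q) ** 2 for w, q in zip(weights, quantized)))
        if err <= tol * norm:
            return bits
    return max_bits


def total_valid_bits(groups, tol=0.05, max_bits=8):
    """Return per-group bitwidths and the sum of bitwidth * group size."""
    widths = [min_bitwidth(g, tol, max_bits) for g in groups]
    total = sum(b * len(g) for b, g in zip(widths, groups))
    return widths, total


if __name__ == "__main__":
    # Hypothetical weight groups: one all-zero group, one sign-only group.
    groups = [[0.0, 0.0, 0.0, 0.0], [1.0, -1.0, 1.0, -1.0]]
    print(total_valid_bits(groups))
```

In this sketch an all-zero group receives zero bits and contributes nothing to the total, which mirrors the abstract's point that allowing zero bitwidth performs model reduction and bit counting in the same pass.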

Comments:	There is an error in the Average Valid Bits computation in Figure 1 on page 2
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1708.05029 [cs.CV]
	(or arXiv:1708.05029v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1708.05029

Submission history

From: Xin Chen
[v1] Wed, 16 Aug 2017 18:28:22 UTC (185 KB)
[v2] Tue, 3 Oct 2017 18:59:27 UTC (1 KB) (withdrawn)
[v3] Sun, 18 Feb 2018 18:42:40 UTC (1 KB) (withdrawn)
