On layer-level control of DNN training and its impact on generalization

Carbonnelle, Simon; De Vleeschouwer, Christophe

Abstract:The generalization ability of a neural network depends on the optimization procedure used for training it. For practitioners and theoreticians, it is essential to identify which properties of the optimization procedure influence generalization. In this paper, we observe that prioritizing the training of distinct layers in a network significantly impacts its generalization ability, sometimes causing differences of up to 30% in test accuracy. In order to better monitor and control such prioritization, we propose to define layer-level training speed as the rotation rate of the layer's weight vector (denoted by layer rotation rate hereafter), and develop Layca, an optimization algorithm that enables direct control over it through each layer's learning rate parameter, without being affected by gradient propagation phenomena (e.g. vanishing gradients). We show that controlling layer rotation rates enables Layca to significantly outperform SGD with the same amount of learning rate tuning on three different tasks (up to 10% test error improvement). Furthermore, we provide experiments that suggest that several intriguing observations related to the training of deep models, i.e. the presence of plateaus in learning curves, the impact of weight decay, and the bad generalization properties of adaptive gradient methods, are all due to specific configurations of layer rotation rates. Overall, our work reveals that layer rotation rates are an important factor for generalization, and that monitoring it should be a key component of any deep learning experiment.

Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as:	arXiv:1806.01603 [cs.LG]
	(or arXiv:1806.01603v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1806.01603

Computer Science > Machine Learning

Title:On layer-level control of DNN training and its impact on generalization

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators