Greedy Layerwise Learning Can Scale to ImageNet

Belilovsky, Eugene; Eickenberg, Michael; Oyallon, Edouard

arXiv:1812.11446 (cs)

[Submitted on 29 Dec 2018 (v1), last revised 23 Apr 2019 (this version, v3)]

Abstract: Shallow supervised 1-hidden-layer neural networks have a number of favorable properties that make them easier to interpret, analyze, and optimize than their deep counterparts, but they lack the representational power of deep networks. Here we use 1-hidden-layer learning problems to sequentially build deep networks layer by layer, which can inherit properties from shallow networks. Contrary to previous approaches using shallow networks, we focus on problems where deep learning is reported as critical for success. We thus study CNNs on image classification tasks using the large-scale ImageNet dataset and the CIFAR-10 dataset. Using a simple set of ideas for architecture and training, we find that solving sequential 1-hidden-layer auxiliary problems leads to a CNN that exceeds AlexNet performance on ImageNet. Extending this training methodology to construct individual layers by solving 2- and 3-hidden-layer auxiliary problems, we obtain an 11-layer network that exceeds several members of the VGG model family on ImageNet, and can train a VGG-11 model to the same accuracy as end-to-end learning. To our knowledge, this is the first competitive alternative to end-to-end training of CNNs that can scale to ImageNet. We illustrate several interesting properties of these models theoretically and conduct a range of experiments to study the properties this training induces on the intermediate layers.
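The layer-by-layer procedure described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes small fully connected layers instead of CNNs, a binary logistic auxiliary classifier, plain SGD, and a four-point XOR dataset, and every name in it (`train_one_layer`, `greedy_layerwise`, and so on) is hypothetical. It shows only the control flow: each step solves one 1-hidden-layer auxiliary problem on the frozen outputs of the layers trained so far.

```python
import math
import random

def forward_layer(W, b, x):
    # One hidden layer: ReLU(W x + b)
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def train_one_layer(X, y, width, epochs=200, lr=0.1, rng=None):
    """Solve one 1-hidden-layer auxiliary problem (toy assumption:
    hidden ReLU layer plus a logistic classifier, trained with SGD)."""
    rng = rng or random.Random(0)
    d = len(X[0])
    W = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(width)]
    b = [0.0] * width
    v = [rng.uniform(-1, 1) for _ in range(width)]  # auxiliary classifier
    c = 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            h = forward_layer(W, b, x)
            z = sum(vj * hj for vj, hj in zip(v, h)) + c
            z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
            g = 1.0 / (1.0 + math.exp(-z)) - t  # dLoss/dz, logistic loss
            for j in range(width):
                gh = g * v[j] if h[j] > 0 else 0.0  # backprop through ReLU
                v[j] -= lr * g * h[j]
                b[j] -= lr * gh
                for i in range(d):
                    W[j][i] -= lr * gh * x[i]
            c -= lr * g
    return (W, b), (v, c)

def predict(layer, aux, x):
    v, c = aux
    h = forward_layer(*layer, x)
    return 1 if sum(vj * hj for vj, hj in zip(v, h)) + c > 0 else 0

def greedy_layerwise(X, y, widths, **kw):
    """Train layers one at a time; earlier layers stay frozen."""
    layers, accs, feats, aux = [], [], X, None
    for k, width in enumerate(widths):
        layer, aux = train_one_layer(feats, y, width,
                                     rng=random.Random(k), **kw)
        hits = sum(predict(layer, aux, x) == t for x, t in zip(feats, y))
        accs.append(hits / len(y))
        layers.append(layer)
        # The new layer's outputs become the fixed inputs of the next problem.
        feats = [forward_layer(*layer, x) for x in feats]
    return layers, aux, accs

# Toy data (assumption): XOR on four points.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0, 1, 1, 0]
layers, aux, accs = greedy_layerwise(X, y, widths=[8, 8, 8])
print("training accuracy of each auxiliary problem:", accs)
```

Only the auxiliary classifier of the final step is kept for prediction; the classifiers of earlier steps are discarded once their hidden layer is frozen. No gradient ever flows back into a previously trained layer, which is the point of contrast with end-to-end training.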

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1812.11446 [cs.LG]
	(or arXiv:1812.11446v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1812.11446

Submission history

From: Eugene Belilovsky
[v1] Sat, 29 Dec 2018 23:31:50 UTC (488 KB)
[v2] Tue, 29 Jan 2019 06:00:15 UTC (483 KB)
[v3] Tue, 23 Apr 2019 17:43:48 UTC (483 KB)
