Faster Gradient-based NAS Pipeline Combining Broad Scalable Architecture with Confident Learning Rate

Zixiang Ding, Yaran Chen, Nannan Li, Dongbin Zhao

arXiv:2009.08886v1 (cs)

[Submitted on 18 Sep 2020 (this version), latest version 25 Jan 2021 (v4)]

Abstract: To further improve the search efficiency of Neural Architecture Search (NAS), we propose B-DARTS, a novel pipeline that combines a broad scalable architecture with a Confident Learning Rate (CLR). In B-DARTS, the Broad Convolutional Neural Network (BCNN) serves as the scalable architecture for DARTS, a popular differentiable NAS approach. On the one hand, BCNN is a broad scalable architecture whose topology offers two advantages over a deep one: faster single-step training and higher memory efficiency (i.e., a larger batch size for architecture search), both of which improve the search efficiency of NAS. On the other hand, DARTS discovers the optimal architecture with a gradient-based optimization algorithm, which benefits from both advantages of BCNN at once. Like vanilla DARTS, B-DARTS also suffers from performance collapse, in which the search strategy tends to select weight-free operations. To mitigate this issue, we propose CLR, which treats the confidence of the gradient used to update the architecture weights as increasing with the training time of the over-parameterized model. Experimental results on CIFAR-10 and ImageNet show that 1) B-DARTS delivers state-of-the-art search efficiency of 0.09 GPU days using the first-order approximation on CIFAR-10; 2) the architecture learned by B-DARTS achieves competitive performance with state-of-the-art composite multiply-accumulate operations and parameters on ImageNet; and 3) the proposed CLR effectively alleviates performance collapse in both B-DARTS and DARTS.
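
The CLR idea in the abstract (trusting architecture-weight gradients more as the over-parameterized model trains longer) can be illustrated with a rough sketch. The abstract does not give the actual CLR formula, so the power-law ramp below, the function name confident_lr, and the parameters base_lr, total_epochs, and beta are all assumptions made purely for illustration, not the authors' method.

```python
def confident_lr(base_lr: float, epoch: int, total_epochs: int, beta: float = 2.0) -> float:
    """Hypothetical confidence-scaled learning rate for architecture weights.

    ASSUMPTION: confidence is modeled as (epoch / total_epochs) ** beta.
    This power-law ramp is an illustrative stand-in, not the formula from the paper.
    """
    if total_epochs <= 0:
        raise ValueError("total_epochs must be positive")
    # Fraction of training completed, clamped to [0, 1].
    progress = min(max(epoch / total_epochs, 0.0), 1.0)
    # Early gradients are trusted little, so the effective rate starts near zero
    # and grows toward base_lr as training proceeds.
    return base_lr * progress ** beta


# Illustrative values only: a 50-epoch search with a base rate of 3e-4.
schedule = [confident_lr(3e-4, epoch, 50) for epoch in range(51)]
```

Under this assumed schedule, the architecture weights are barely updated in early epochs, when the shared network weights are still poorly trained, which is one plausible way to keep the search from prematurely favoring weight-free operations.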

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as:	arXiv:2009.08886 [cs.CV]
	(or arXiv:2009.08886v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2009.08886

Submission history

From: Zixiang Ding
[v1] Fri, 18 Sep 2020 15:25:08 UTC (3,324 KB)
[v2] Mon, 21 Sep 2020 07:09:31 UTC (3,324 KB)
[v3] Thu, 21 Jan 2021 02:59:21 UTC (6,237 KB)
[v4] Mon, 25 Jan 2021 09:05:02 UTC (6,254 KB)
