On the Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models

Ju, Peizhong; Lin, Xiaojun; Shroff, Ness B.

arXiv:2103.05243 (cs)

[Submitted on 9 Mar 2021 (v1), last revised 7 Mar 2023 (this version, v3)]

Abstract: In this paper, we study the generalization performance of min $\ell_2$-norm overfitting solutions for the neural tangent kernel (NTK) model of a two-layer neural network with ReLU activation that has no bias term. We show that, depending on the ground-truth function, the test error of overfitted NTK models exhibits characteristics that are different from the "double-descent" of other overparameterized linear models with simple Fourier or Gaussian features. Specifically, for a class of learnable functions, we provide a new upper bound of the generalization error that approaches a small limiting value, even when the number of neurons $p$ approaches infinity. This limiting value further decreases with the number of training samples $n$. For functions outside of this class, we provide a lower bound on the generalization error that does not diminish to zero even when $n$ and $p$ are both large.
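
As an illustration of the object the abstract describes, the following is a minimal sketch and not the authors' code. It assumes the NTK feature map comes from linearizing the network in its first-layer weights with the second-layer weights held fixed, so that neuron $j$ contributes the block $\mathbf{1}\{x^\top V_0[j] > 0\}\, x^\top$. The names `ntk_features` and `min_l2_norm_fit`, the sizes `n`, `d`, `p`, and the linear ground truth are hypothetical choices made only for this example.

```python
import numpy as np

def ntk_features(X, V0):
    """Assumed NTK feature map of a bias-free two-layer ReLU network.

    X:  (n, d) inputs.
    V0: (p, d) initial first-layer weights.
    Returns H of shape (n, p * d); row i concatenates
    1{x_i . V0[j] > 0} * x_i over the p neurons.
    """
    active = (X @ V0.T > 0).astype(X.dtype)        # (n, p) ReLU activation pattern
    blocks = active[:, :, None] * X[:, None, :]    # (n, p, d)
    return blocks.reshape(X.shape[0], -1)

def min_l2_norm_fit(H, y):
    """Minimum l2-norm solution of H @ delta_v = y via the pseudoinverse."""
    return np.linalg.pinv(H) @ y

rng = np.random.default_rng(0)
n, d, p = 20, 3, 200                               # hypothetical sizes with p * d > n
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)      # place inputs on the unit sphere
V0 = rng.standard_normal((p, d))
y = X[:, 0]                                        # hypothetical linear ground truth

H = ntk_features(X, V0)
delta_v = min_l2_norm_fit(H, y)
train_residual = np.max(np.abs(H @ delta_v - y))   # near zero when the fit interpolates
```

With `p * d` much larger than `n`, the feature matrix generically has full row rank, so the fit interpolates the training data; the abstract's question is how the test error of such an interpolating solution behaves as `p` and `n` grow.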

Comments:	Published in ICML 2021. This version fixes an error in Lemma 31 and the parts affected by it. The main results remain the same except for small changes to certain coefficients in Eq. (9).
Subjects:	Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:2103.05243 [cs.LG]
	(or arXiv:2103.05243v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2103.05243

Submission history

From: Peizhong Ju
[v1] Tue, 9 Mar 2021 06:24:59 UTC (457 KB)
[v2] Tue, 24 Aug 2021 05:18:52 UTC (607 KB)
[v3] Tue, 7 Mar 2023 20:54:33 UTC (621 KB)
