Two Phases of Scaling Laws for Nearest Neighbor Classifiers

Yang, Pengkun; Zhang, Jingzhao

Statistics > Machine Learning

arXiv:2308.08247v1 (stat)

[Submitted on 16 Aug 2023 (this version), latest version 3 Jun 2025 (v2)]

Abstract: A scaling law refers to the observation that the test performance of a model improves as the amount of training data increases. A fast scaling law implies that one can solve machine learning problems by simply increasing the data and model sizes. Yet, in many cases, the benefit of adding more data can be negligible. In this work, we study the rate of scaling laws of nearest neighbor classifiers. We show that a scaling law can have two phases: in the first phase, the generalization error depends polynomially on the data dimension and decreases fast, whereas in the second phase, the error depends exponentially on the data dimension and decreases slowly. Our analysis highlights the role of the complexity of the data distribution in determining the generalization error. When the data is benignly distributed, our result suggests that a nearest neighbor classifier can achieve a generalization error that depends polynomially, instead of exponentially, on the data dimension.
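To make the notion of a scaling curve concrete, the following is a minimal sketch that measures the test error of a 1-nearest-neighbor classifier as the training set grows. It is not the paper's experiment: the synthetic distribution (uniform points in [-1, 1]^d labeled by the sign of the first coordinate) and the helper names `sample`, `predict_1nn`, and `estimate_error` are assumptions chosen purely for illustration.

```python
import random

def sample(n, d, rng):
    """Draw n points uniformly from [-1, 1]^d; label is 1 if the first coordinate is positive."""
    xs = [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(n)]
    ys = [1 if x[0] > 0 else 0 for x in xs]
    return xs, ys

def predict_1nn(train_x, train_y, query):
    """Return the label of the training point closest to `query` (squared Euclidean distance)."""
    best_label, best_dist = None, float("inf")
    for x, y in zip(train_x, train_y):
        dist = sum((a - b) ** 2 for a, b in zip(x, query))
        if dist < best_dist:
            best_dist, best_label = dist, y
    return best_label

def estimate_error(n_train, d, n_test=200, seed=0):
    """Estimate the 1-NN misclassification rate on fresh samples from the same distribution."""
    rng = random.Random(seed)
    train_x, train_y = sample(n_train, d, rng)
    test_x, test_y = sample(n_test, d, rng)
    wrong = sum(predict_1nn(train_x, train_y, q) != y for q, y in zip(test_x, test_y))
    return wrong / n_test

if __name__ == "__main__":
    # Print the error for growing training sets in a low and a higher dimension.
    for d in (2, 20):
        for n in (10, 100, 1000):
            print(f"d={d:2d} n={n:4d} error={estimate_error(n, d):.3f}")
```

Plotting the printed errors against n on a log-log scale for several values of d is one simple way to see how the decay rate changes with dimension on this toy distribution; whether two distinct phases appear depends on the distribution and is not guaranteed by this sketch.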

Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST)
Cite as: arXiv:2308.08247 [stat.ML] (or arXiv:2308.08247v1 [stat.ML] for this version)
DOI: https://doi.org/10.48550/arXiv.2308.08247

Submission history

From: Jingzhao Zhang
[v1] Wed, 16 Aug 2023 09:28:55 UTC (9,679 KB)
[v2] Tue, 3 Jun 2025 07:05:41 UTC (1,172 KB)
