Minimax Supervised Clustering in the Anisotropic Gaussian Mixture Model: A new take on Robust Interpolation

Minsker, Stanislav; Ndaoud, Mohamed; Shen, Yiqiu

Mathematics > Statistics Theory

arXiv:2111.07041 (math)

[Submitted on 13 Nov 2021]

Title:Minimax Supervised Clustering in the Anisotropic Gaussian Mixture Model: A new take on Robust Interpolation

Authors:Stanislav Minsker, Mohamed Ndaoud, Yiqiu Shen

View PDF

Abstract:We study the supervised clustering problem under the two-component anisotropic Gaussian mixture model in high dimensions and in the non-asymptotic setting. We first derive a lower and a matching upper bound for the minimax risk of clustering in this framework. We also show that in the high-dimensional regime, the linear discriminant analysis (LDA) classifier turns out to be sub-optimal in the minimax sense. Next, we characterize precisely the risk of $\ell_2$-regularized supervised least squares classifiers. We deduce the fact that the interpolating solution may outperform the regularized classifier, under mild assumptions on the covariance structure of the noise. Our analysis also shows that interpolation can be robust to corruption in the covariance of the noise when the signal is aligned with the "clean" part of the covariance, for the properly defined notion of alignment. To the best of our knowledge, this peculiar phenomenon has not yet been investigated in the rapidly growing literature related to interpolation. We conclude that interpolation is not only benign but can also be optimal, and in some cases robust.

Subjects:	Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:2111.07041 [math.ST]
	(or arXiv:2111.07041v1 [math.ST] for this version)
	https://doi.org/10.48550/arXiv.2111.07041

Submission history

From: Mohamed Ndaoud [view email]
[v1] Sat, 13 Nov 2021 05:19:37 UTC (348 KB)

Mathematics > Statistics Theory

Title:Minimax Supervised Clustering in the Anisotropic Gaussian Mixture Model: A new take on Robust Interpolation

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Mathematics > Statistics Theory

Title:Minimax Supervised Clustering in the Anisotropic Gaussian Mixture Model: A new take on Robust Interpolation

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators