Kernel Thinning

Dwivedi, Raaz; Mackey, Lester

arXiv:2105.05842 (stat)

[Submitted on 12 May 2021 (v1), last revised 12 May 2024 (this version, v11)]

Abstract: We introduce kernel thinning, a new procedure for compressing a distribution $\mathbb{P}$ more effectively than i.i.d. sampling or standard thinning. Given a suitable reproducing kernel $\mathbf{k}_{\star}$ and $O(n^2)$ time, kernel thinning compresses an $n$-point approximation to $\mathbb{P}$ into a $\sqrt{n}$-point approximation with comparable worst-case integration error across the associated reproducing kernel Hilbert space. The maximum discrepancy in integration error is $O_d(n^{-1/2}\sqrt{\log n})$ in probability for compactly supported $\mathbb{P}$ and $O_d(n^{-1/2} (\log n)^{(d+1)/2}\sqrt{\log\log n})$ for sub-exponential $\mathbb{P}$ on $\mathbb{R}^d$. In contrast, an equal-sized i.i.d. sample from $\mathbb{P}$ suffers $\Omega(n^{-1/4})$ integration error. Our sub-exponential guarantees resemble the classical quasi-Monte Carlo error rates for uniform $\mathbb{P}$ on $[0,1]^d$ but apply to general distributions on $\mathbb{R}^d$ and a wide range of common kernels. Moreover, the same construction delivers near-optimal $L^\infty$ coresets in $O(n^2)$ time. We use our results to derive explicit non-asymptotic maximum mean discrepancy bounds for Gaussian, Matérn, and B-spline kernels and present two vignettes illustrating the practical benefits of kernel thinning over i.i.d. sampling and standard Markov chain Monte Carlo thinning, in dimensions $d=2$ through $100$.
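
The quantity compared in the abstract, the worst-case integration error over a reproducing kernel Hilbert space, is the maximum mean discrepancy (MMD) between two point sets. The following is a minimal sketch, not the authors' algorithm: it only measures the MMD of the two baselines the abstract names (standard thinning and an equal-sized i.i.d. sample) against an $n$-point input. The Gaussian kernel, the bandwidth of 1, the standard normal target, and all function and variable names are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Kernel matrix with entries exp(-||x_i - y_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd(points, coreset, bandwidth=1.0):
    """MMD between the empirical distributions of two point sets."""
    k_pp = gaussian_kernel(points, points, bandwidth).mean()
    k_pc = gaussian_kernel(points, coreset, bandwidth).mean()
    k_cc = gaussian_kernel(coreset, coreset, bandwidth).mean()
    return float(np.sqrt(max(k_pp - 2.0 * k_pc + k_cc, 0.0)))


rng = np.random.default_rng(0)
n, d = 1024, 2
m = int(np.sqrt(n))  # coreset size sqrt(n)

# Assumed target: standard normal on R^d, approximated by n i.i.d. draws.
points = rng.standard_normal((n, d))

# Baseline 1: standard thinning keeps every (n // m)-th input point.
standard_thinned = points[:: n // m]

# Baseline 2: a fresh i.i.d. sample of the same size m.
iid_sample = rng.standard_normal((m, d))

print("MMD, standard thinning:", mmd(points, standard_thinned))
print("MMD, i.i.d. sample:    ", mmd(points, iid_sample))
```

An i.i.d. sample of size $\sqrt{n}$ has integration error of order $(\sqrt{n})^{-1/2} = n^{-1/4}$, which is the rate the abstract contrasts with the near-$n^{-1/2}$ guarantees for kernel thinning. Evaluating `mmd` costs $O(n^2)$ kernel evaluations, the same order as the running time quoted for kernel thinning itself.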

Comments:	Accepted for presentation as an extended abstract at the Conference on Learning Theory (COLT) 2021, and published in the Journal of Machine Learning Research (JMLR) 2024
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Statistics Theory (math.ST); Computation (stat.CO); Methodology (stat.ME)
Cite as:	arXiv:2105.05842 [stat.ML]
	(or arXiv:2105.05842v11 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2105.05842

Submission history

From: Raaz Dwivedi
[v1] Wed, 12 May 2021 17:56:42 UTC (479 KB)
[v2] Mon, 17 May 2021 17:59:23 UTC (480 KB)
[v3] Thu, 1 Jul 2021 20:57:55 UTC (482 KB)
[v4] Sun, 1 Aug 2021 22:44:15 UTC (762 KB)
[v5] Fri, 8 Oct 2021 03:17:47 UTC (763 KB)
[v6] Sat, 13 Nov 2021 02:14:18 UTC (618 KB)
[v7] Wed, 13 Apr 2022 22:59:44 UTC (1,133 KB)
[v8] Tue, 19 Jul 2022 19:17:15 UTC (1,133 KB)
[v9] Wed, 7 Jun 2023 00:41:12 UTC (1,150 KB)
[v10] Fri, 9 Jun 2023 00:45:40 UTC (1,149 KB)
[v11] Sun, 12 May 2024 00:15:31 UTC (1,144 KB)
