-
Robust Density Estimation under Besov IPM Losses
Authors:
Ananya Uppal,
Shashank Singh,
Barnabas Poczos
Abstract:
We study minimax convergence rates of nonparametric density estimation in the Huber contamination model, in which a proportion of the data comes from an unknown outlier distribution. We provide the first results for this problem under a large family of losses, called Besov integral probability metrics (IPMs), that includes $\mathcal{L}^p$, Wasserstein, Kolmogorov-Smirnov, and other common distances between probability distributions. Specifically, under a range of smoothness assumptions on the population and outlier distributions, we show that a re-scaled thresholding wavelet series estimator achieves minimax optimal convergence rates under a wide variety of losses. Finally, based on connections that have recently been shown between nonparametric density estimation under IPM losses and generative adversarial networks (GANs), we show that certain GAN architectures also achieve these minimax rates.
Submitted 6 September, 2021; v1 submitted 18 April, 2020;
originally announced April 2020.
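For orientation, the two central objects named in this abstract can be written in their standard textbook form. The notation below is assumed for illustration and is not taken from the paper: $\epsilon$ is the contamination proportion, $P$ the population distribution, $G$ the unknown outlier distribution, and $\mathcal{F}$ the function class defining the metric.
\[X_1,\dots,X_n \overset{\text{i.i.d.}}{\sim} (1-\epsilon)P + \epsilon G, \qquad d_{\mathcal{F}}(P,Q)=\sup_{f\in\mathcal{F}}\left|\mathbb{E}_{X\sim P}[f(X)]-\mathbb{E}_{X\sim Q}[f(X)]\right|\]
The first expression is the Huber contamination model and the second is a generic IPM; a Besov IPM presumably takes $\mathcal{F}$ to be a ball in a Besov space.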
-
Nonparametric Density Estimation & Convergence Rates for GANs under Besov IPM Losses
Authors:
Ananya Uppal,
Shashank Singh,
Barnabás Póczos
Abstract:
We study the problem of estimating a nonparametric probability density under a large family of losses called Besov IPMs, which include, for example, $\mathcal{L}^p$ distances, total variation distance, and generalizations of both Wasserstein and Kolmogorov-Smirnov distances. For a wide variety of settings, we provide both lower and upper bounds, identifying precisely how the choice of loss function and assumptions on the data interact to determine the minimax optimal convergence rate. We also show that linear distribution estimates, such as the empirical distribution or kernel density estimator, often fail to converge at the optimal rate. Our bounds generalize, unify, or improve several recent and classical results. Moreover, IPMs can be used to formalize a statistical model of generative adversarial networks (GANs). Thus, we show how our results imply bounds on the statistical error of a GAN, showing, for example, that GANs can strictly outperform the best linear estimator.
Submitted 13 January, 2020; v1 submitted 9 February, 2019;
originally announced February 2019.
-
Nonparametric Density Estimation under Adversarial Losses
Authors:
Shashank Singh,
Ananya Uppal,
Boyue Li,
Chun-Liang Li,
Manzil Zaheer,
Barnabás Póczos
Abstract:
We study minimax convergence rates of nonparametric density estimation under a large class of loss functions called "adversarial losses", which, besides classical $\mathcal{L}^p$ losses, includes maximum mean discrepancy (MMD), Wasserstein distance, and total variation distance. These losses are closely related to the losses encoded by discriminator networks in generative adversarial networks (GANs). In a general framework, we study how the choice of loss and the assumed smoothness of the underlying density together determine the minimax rate. We also discuss implications for training GANs based on deep ReLU networks, and more general connections to learning implicit generative models in a minimax statistical sense.
Submitted 28 October, 2018; v1 submitted 22 May, 2018;
originally announced May 2018.
-
Spacing Distribution of a Bernoulli Sampled Sequence
Authors:
Abigail L. Turner,
Ananya Uppal,
Peng Xu
Abstract:
We investigate the spacing distribution of the sequence \[S_n=\left\{0,\frac{1}{n},\frac{2}{n},\dots,\frac{n-1}{n},1\right\}\] after Bernoulli sampling. We give a closed-form expression for the probability mass function of the spacings, and show that the spacings converge in distribution to a geometric random variable.
Submitted 12 October, 2015;
originally announced October 2015.
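The geometric limit described in the abstract above can be checked numerically with a small simulation. This is a minimal sketch under assumptions that the abstract does not state: each point of $S_n$ is kept independently with a fixed probability `p`, spacings are measured in units of $1/n$, and the helper names `sampled_spacings` and `empirical_pmf` are hypothetical.

```python
import random


def sampled_spacings(n, p, rng):
    """Keep each index 0..n independently with probability p and
    return the gaps between consecutive kept indices (units of 1/n)."""
    kept = [k for k in range(n + 1) if rng.random() < p]
    return [b - a for a, b in zip(kept, kept[1:])]


def empirical_pmf(gaps):
    """Relative frequency of each observed gap size."""
    counts = {}
    for g in gaps:
        counts[g] = counts.get(g, 0) + 1
    total = len(gaps)
    return {g: c / total for g, c in sorted(counts.items())}


rng = random.Random(0)   # fixed seed so the run is reproducible
n, p = 200_000, 0.3      # illustrative values, not from the paper
gaps = sampled_spacings(n, p, rng)
pmf = empirical_pmf(gaps)

# Compare the empirical frequencies with the geometric pmf p * (1 - p)^(k - 1).
for k in range(1, 6):
    print(k, round(pmf.get(k, 0.0), 4), round(p * (1 - p) ** (k - 1), 4))
```

For large `n` the two printed columns should be close, which is consistent with the convergence claim; the sketch says nothing about the exact finite-`n` probability mass function derived in the paper.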