-
On the robustness of semi-discrete optimal transport
Authors:
Davy Paindaveine,
Riccardo Passeggeri
Abstract:
We derive the breakdown point for solutions of semi-discrete optimal transport problems, which characterizes the robustness of the multivariate quantiles based on optimal transport proposed in Ghosal and Sen (2022). We do so under very mild assumptions: the absolutely continuous reference measure is only assumed to have a support that is compact and convex, whereas the target measure is a general discrete measure on a finite number, $n$ say, of atoms. The breakdown point depends on the target measure only through its probability weights (hence not on the location of the atoms) and involves the geometry of the reference measure through the Tukey (1975) concept of halfspace depth. Remarkably, depending on this geometry, the breakdown point of the optimal transport median can be strictly smaller than the breakdown point of the univariate median or the breakdown point of the spatial median, namely $\lceil n/2\rceil/n$. In the context of robust location estimation, our results provide a subtle insight into how to perform multivariate trimming when constructing trimmed means based on optimal transport.
Submitted 25 October, 2024;
originally announced October 2024.
-
On optimal tests for rotational symmetry against new classes of hyperspherical distributions
Authors:
Eduardo García-Portugués,
Davy Paindaveine,
Thomas Verdebout
Abstract:
Motivated by the central role played by rotationally symmetric distributions in directional statistics, we consider the problem of testing rotational symmetry on the hypersphere. We adopt a semiparametric approach and tackle problems where the location of the symmetry axis is either specified or unspecified. For each problem, we define two tests and study their asymptotic properties under very mild conditions. We introduce two new classes of directional distributions that extend the rotationally symmetric class and are of independent interest. We prove that each test is locally asymptotically maximin, in the Le Cam sense, against one of the types of alternatives given by the new classes of distributions, for both specified and unspecified symmetry axis. The tests, aimed at detecting location-like and scatter-like alternatives, are combined into convenient hybrid tests that are consistent against both alternatives. We perform Monte Carlo experiments that illustrate the finite-sample performances of the proposed tests and their agreement with the asymptotic results. Finally, the practical relevance of our tests is illustrated on a real-data application from astronomy. The R package rotasym implements the proposed tests and allows practitioners to reproduce the data application.
Submitted 21 September, 2020; v1 submitted 15 June, 2017;
originally announced June 2017.
-
Conditional quantile estimation through optimal quantization
Authors:
Isabelle Charlier,
Davy Paindaveine,
Jérôme Saracco
Abstract:
In this paper, we use quantization to construct a nonparametric estimator of conditional quantiles of a scalar response $Y$ given a $d$-dimensional vector of covariates $X$. First we focus on the population level and show how optimal quantization of $X$, which consists in discretizing $X$ by projecting it on an appropriate grid of $N$ points, allows us to approximate conditional quantiles of $Y$ given $X$. We show that this approximation is arbitrarily good as $N$ goes to infinity and provide a rate of convergence for the approximation error. Then we turn to the sample case and define an estimator of conditional quantiles based on quantization ideas. We prove that this estimator is consistent for its fixed-$N$ population counterpart. The results are illustrated on a numerical example. Dominance of our estimators over local constant/linear ones and nearest neighbor ones is demonstrated through extensive simulations in the companion paper Charlier et al. (2014b).
Submitted 12 May, 2014;
originally announced May 2014.
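The projection idea in the abstract above can be illustrated with a minimal sketch. It assumes a grid of $N$ points is already given (the paper's optimal choice of grid is not implemented here), and it uses an ordinary sample quantile within the cell of the query point, which may differ in detail from the estimator the authors define. The function name and its arguments are hypothetical.

```python
import numpy as np

def quantized_conditional_quantile(X, Y, grid, x0, alpha):
    """Sample alpha-quantile of Y among observations whose X projects
    onto the same grid point as x0 (hypothetical helper, user-supplied grid)."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(Y), -1)   # (n, d)
    d = X.shape[1]
    grid = np.asarray(grid, dtype=float).reshape(-1, d)  # (N, d)
    x0 = np.asarray(x0, dtype=float).reshape(1, d)

    def nearest(points):
        # Index of the closest grid point (Euclidean distance) for each row.
        sq_dist = ((points[:, None, :] - grid[None, :, :]) ** 2).sum(axis=-1)
        return sq_dist.argmin(axis=1)

    in_cell = nearest(X) == nearest(x0)[0]
    if not in_cell.any():
        raise ValueError("no observation falls in the quantization cell of x0")
    return float(np.quantile(Y[in_cell], alpha))
```

With a coarse grid the estimate is stable but biased; a finer grid reduces the approximation error at the price of fewer observations per cell.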
-
Probit transformation for nonparametric kernel estimation of the copula density
Authors:
Gery Geenens,
Arthur Charpentier,
Davy Paindaveine
Abstract:
Copula modelling has become ubiquitous in modern statistics. Here, the problem of nonparametrically estimating a copula density is addressed. Arguably the most popular nonparametric density estimator, the kernel estimator is not suitable for the unit-square-supported copula densities, mainly because it is heavily affected by boundary bias issues. In addition, most common copulas admit unbounded densities, and kernel methods are not consistent in that case. In this paper, a kernel-type copula density estimator is proposed. It is based on the idea of transforming the uniform marginals of the copula density into normal distributions via the probit function, estimating the density in the transformed domain, which can be accomplished without boundary problems, and obtaining an estimate of the copula density through back-transformation. Although natural, a raw application of this procedure was, however, seen not to perform very well in the earlier literature. Here, it is shown that, if combined with local likelihood density estimation methods, the idea yields very good and easy-to-implement estimators, fixing boundary issues in a natural way and able to cope with unbounded copula densities. The asymptotic properties of the suggested estimators are derived, and a practical way of selecting the crucially important smoothing parameters is devised. Finally, extensive simulation studies and a real data analysis evidence their excellent performance compared to their main competitors.
Submitted 16 April, 2014;
originally announced April 2014.
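The transform, estimate, back-transform recipe described in the abstract above can be sketched as follows. This is only the raw version that the abstract says performs poorly, not the local likelihood estimators the paper proposes. It assumes rank-based pseudo-observations and a standard Gaussian kernel density estimate with SciPy's default bandwidth; the function name is hypothetical.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm, rankdata

def naive_probit_copula_density(x, y):
    """Return a function (u, v) -> estimated copula density, built by
    kernel-smoothing the probit-transformed pseudo-observations."""
    n = len(x)
    # Pseudo-observations in (0, 1); dividing by n + 1 keeps them off the boundary.
    u = rankdata(x) / (n + 1)
    v = rankdata(y) / (n + 1)
    # Density estimate in the transformed (normal-margins) domain.
    kde = gaussian_kde(np.vstack([norm.ppf(u), norm.ppf(v)]))

    def c_hat(u0, v0):
        s = np.atleast_1d(norm.ppf(u0))
        t = np.atleast_1d(norm.ppf(v0))
        # Back-transformation: divide by the Jacobian phi(s) * phi(t).
        return kde(np.vstack([s, t])) / (norm.pdf(s) * norm.pdf(t))

    return c_hat
```

For independent samples the returned function should be close to 1 in the interior of the unit square; near the corners the division by small normal densities amplifies estimation error, which is one reason a raw application is fragile.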
-
Depth-based Runs Tests for Bivariate Central Symmetry
Authors:
Rainer Dyckerhoff,
Christophe Ley,
Davy Paindaveine
Abstract:
McWilliams (1990) introduced a nonparametric procedure based on runs for the problem of testing univariate symmetry about the origin (equivalently, about an arbitrary specified center). His procedure first reorders the observations according to their absolute values, then rejects the null when the number of runs in the resulting series of signs is too small. This test is universally consistent and enjoys nice robustness properties, but is unfortunately limited to the univariate setup. In this paper, we extend McWilliams' procedure into tests of bivariate central symmetry. The proposed tests first reorder the observations according to their statistical depth in a symmetrized version of the sample, then reject the null when an original concept of simplicial runs is too small. Our tests are affine-invariant and have good robustness properties. In particular, they do not require any finite moment assumption. We derive their limiting null distribution, which establishes their asymptotic distribution-freeness. We study their finite-sample properties through Monte Carlo experiments, and conclude with some final comments.
Submitted 10 January, 2014;
originally announced January 2014.
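The univariate McWilliams procedure summarized in the abstract above (sort by absolute value, count runs of signs, reject for few runs) can be sketched in a few lines. This covers only the univariate test, not the depth-based bivariate extension of the paper. The p-value assumes a continuous distribution, so that under symmetry the signs are independent fair coin flips and the number of sign changes is Binomial($n-1$, $1/2$); the function name and the handling of observations equal to the center are assumptions made for illustration.

```python
from math import comb

def mcwilliams_runs_test(x, center=0.0):
    """Return (number of runs, lower-tail p-value) for symmetry about `center`."""
    # Assumption: observations exactly equal to the center carry no sign and are dropped.
    z = [value - center for value in x if value != center]
    z.sort(key=abs)                      # reorder by absolute value
    signs = [value > 0 for value in z]
    n = len(signs)
    if n < 2:
        raise ValueError("need at least two nonzero observations")
    # A new run starts wherever consecutive signs differ.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    # Under the null, runs - 1 ~ Binomial(n - 1, 1/2); few runs are evidence against symmetry.
    p_value = sum(comb(n - 1, k) for k in range(runs)) / 2 ** (n - 1)
    return runs, p_value
```

A skewed sample tends to put observations of one sign at the large-magnitude end of the ordering, which produces long runs and hence a small run count.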