-
Sign tests for weak principal directions
Authors:
Davy Paindaveine,
Julien Remy,
Thomas Verdebout
Abstract:
We consider inference on the first principal direction of a $p$-variate elliptical distribution. We do so in challenging double asymptotic scenarios for which this direction eventually fails to be identifiable. In order to achieve robustness not only with respect to such weak identifiability but also with respect to heavy tails, we focus on sign-based statistical procedures, that is, on procedures that involve the observations only through their direction from the center of the distribution. We actually consider the generic problem of testing the null hypothesis that the first principal direction coincides with a given direction of $\mathbb{R}^p$. We first focus on weak identifiability setups involving single spikes (that is, involving spectra for which the smallest eigenvalue has multiplicity $p-1$). We show that, irrespective of the degree of weak identifiability, such setups offer local alternatives for which the corresponding sequence of statistical experiments converges in the Le Cam sense. Interestingly, the limiting experiments depend on the degree of weak identifiability. We exploit this convergence result to build optimal sign tests for the problem considered. In classical asymptotic scenarios where the spectrum is fixed, these tests are shown to be asymptotically equivalent to the sign-based likelihood ratio tests available in the literature. Unlike the latter, however, the proposed sign tests are robust to arbitrarily weak identifiability. We show that our tests meet the asymptotic level constraint irrespective of the structure of the spectrum, hence also in possibly multi-spike setups. We fully characterize the non-null asymptotic distributions of the corresponding test statistics under weak identifiability, which allows us to quantify the corresponding local asymptotic powers.
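As an illustration of what "sign-based" means here, the following minimal sketch computes the direction of each observation from a given center, that is, its projection onto the unit sphere. The function name `spatial_signs` and the assumption that the center is known are illustrative choices, not taken from the paper; the tests proposed in the paper are built on such signs but are not reproduced here.

```python
import math


def spatial_signs(points, center):
    """Return the unit vector (x - center) / ||x - center|| for each point.

    Hypothetical helper for illustration: the center is assumed known.
    Points that coincide with the center have no direction and are skipped.
    """
    signs = []
    for x in points:
        diff = [xi - ci for xi, ci in zip(x, center)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm == 0.0:
            continue  # direction undefined at the center itself
        signs.append([d / norm for d in diff])
    return signs


# Two observations on the same ray from the center share the same sign,
# however far out the second one lies.
example = spatial_signs([(3.0, 4.0), (30.0, 40.0)], (0.0, 0.0))
```

Because only directions are retained, the radial distance of an observation has no influence on the result, which is the source of the robustness to heavy tails mentioned in the abstract.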
Submitted 29 August, 2019; v1 submitted 21 December, 2018;
originally announced December 2018.
-
Testing for Principal Component Directions under Weak Identifiability
Authors:
Davy Paindaveine,
Julien Remy,
Thomas Verdebout
Abstract:
We consider the problem of testing, on the basis of a $p$-variate Gaussian random sample, the null hypothesis ${\cal H}_0: {\pmb θ}_1= {\pmb θ}_1^0$ against the alternative ${\cal H}_1: {\pmb θ}_1 \neq {\pmb θ}_1^0$, where ${\pmb θ}_1$ is the "first" eigenvector of the underlying covariance matrix and ${\pmb θ}_1^0$ is a fixed unit $p$-vector. In the classical setup where eigenvalues $λ_1>λ_2\geq \ldots\geq λ_p$ are fixed, the Anderson (1963) likelihood ratio test (LRT) and the Hallin, Paindaveine and Verdebout (2010) Le Cam optimal test for this problem are asymptotically equivalent under the null hypothesis, hence also under sequences of contiguous alternatives. We show that this equivalence does not survive asymptotic scenarios where $λ_{n1}/λ_{n2}=1+O(r_n)$ with $r_n=O(1/\sqrt{n})$. For such scenarios, the Le Cam optimal test still asymptotically meets the nominal level constraint, whereas the LRT severely overrejects the null hypothesis. Consequently, the former test should be favored over the latter whenever the two largest sample eigenvalues are close to each other. By relying on Le Cam's asymptotic theory of statistical experiments, we study the non-null and optimality properties of the Le Cam optimal test in the aforementioned asymptotic scenarios and show that the null robustness of this test is not obtained at the expense of power. Our asymptotic investigation is extensive in the sense that it allows $r_n$ to converge to zero at an arbitrary rate. While we restrict to single-spiked spectra of the form $λ_{n1}>λ_{n2}=\ldots=λ_{np}$ to make our results as striking as possible, we extend our results to the more general elliptical case. Finally, we present an illustrative real data example.
Submitted 30 December, 2018; v1 submitted 15 October, 2017;
originally announced October 2017.
-
On the Number of Balanced Words of Given Length and Height over a Two-Letter Alphabet
Authors:
Nicolas Bedaride,
Eric Domenjoud,
Damien Jamet,
Jean-Luc Remy
Abstract:
We exhibit a recurrence on the number of discrete line segments joining two integer points in the plane using an encoding of such segments as balanced words of given length and height over the two-letter alphabet $\{0,1\}$. We give generating functions and study the asymptotic behaviour. As a particular case, we focus on the symmetrical discrete segments which are encoded by balanced palindromes.
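To make the counted objects concrete, here is a minimal brute-force sketch, assuming the usual definition: a word over $\{0,1\}$ is balanced if any two of its factors of equal length differ by at most one in their number of 1s, and its height is its number of 1s. The function names are illustrative, and the enumeration is exponential in the length; it is not the recurrence from the paper, only a way to check small cases.

```python
from itertools import product


def is_balanced(word):
    """Check that equal-length factors differ by at most one in their count of 1s."""
    n = len(word)
    for k in range(1, n + 1):
        # Number of 1s in every factor of length k.
        counts = [sum(word[i:i + k]) for i in range(n - k + 1)]
        if max(counts) - min(counts) > 1:
            return False
    return True


def count_balanced(length, height):
    """Count balanced words of the given length with exactly `height` ones."""
    return sum(
        1
        for w in product((0, 1), repeat=length)
        if sum(w) == height and is_balanced(w)
    )


# Of the six words of length 4 and height 2, only 0011 and 1100 are
# unbalanced (their factors 00 and 11 differ by two), so four remain.
small_case = count_balanced(4, 2)
```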
Submitted 29 July, 2010;
originally announced July 2010.