A one-sample location test based on weighted averaging of two test statistics in high-dimensional data
Authors:
Masashi Hyodo,
Takahiro Nishiyama
Abstract:
We discuss a one-sample location test that can be used in the case of high-dimensional data. For high-dimensional data, the power of Hotelling's test decreases when the dimension is close to the sample size. To address this loss of power, some non-exact approaches have been proposed, e.g., Dempster (1958, 1960), Bai and Saranadasa (1996) and Srivastava and Du (2006). In this paper, we focus on Hotelling's test and Dempster's test. The comparative merits and demerits of these two tests vary according to the local parameters. In particular, we consider the situation where it is difficult to determine which test should be used, that is, where the two tests are asymptotically equivalent in terms of local power. We propose a new statistic based on the weighted averaging of Hotelling's $T^2$ statistic and Dempster's statistic that can be applied in such a situation. Our weight is determined on the basis of the maximum local asymptotic power on a restricted parameter space that induces local asymptotic equivalence between Hotelling's test and Dempster's test. In addition, some good asymptotic properties with respect to the local power are shown. Numerical results show that our test is more stable than the tests based on Hotelling's $T^2$ statistic and Dempster's statistic in most parameter settings.
Submitted 9 May, 2014;
originally announced May 2014.
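As a rough illustration of the two ingredients named in the abstract above, the sketch below computes Hotelling's $T^2$ statistic and a Dempster-type trace-normalized statistic for a one-sample location problem. The function names and the exact normalization of the Dempster-type statistic are assumptions made for illustration; the paper's weight and any standardization used in its weighted average are not given in the abstract and are not reproduced here.

```python
import numpy as np

def hotelling_t2(X, mu0):
    """Hotelling's T^2 = n (xbar - mu0)' S^{-1} (xbar - mu0).

    Requires n > p so that the sample covariance S is invertible.
    """
    X = np.asarray(X, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    n, p = X.shape
    if n <= p:
        raise ValueError("Hotelling's T^2 needs more observations than dimensions")
    d = X.mean(axis=0) - mu0
    S = np.atleast_2d(np.cov(X, rowvar=False))  # unbiased estimate, divisor n - 1
    return float(n * d @ np.linalg.solve(S, d))

def dempster_type(X, mu0):
    """Dempster-type statistic n (xbar - mu0)'(xbar - mu0) / tr(S).

    Assumed form for illustration: S^{-1} is replaced by a trace
    normalization, so the statistic stays defined even when p >= n.
    """
    X = np.asarray(X, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    n, _ = X.shape
    d = X.mean(axis=0) - mu0
    S = np.atleast_2d(np.cov(X, rowvar=False))
    return float(n * d @ d / np.trace(S))
```

The first statistic uses the full inverse covariance, while the second only uses its trace, which is one way to see why their relative power can depend on the parameter setting.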
Asymptotic Properties of the Misclassification Errors for Euclidean Distance Discriminant Rule in High-Dimensional Data
Authors:
H. Watanabe,
M. Hyodo,
T. Seo,
T. Pavlenko
Abstract:
Performance accuracy of the Euclidean Distance Discriminant rule (EDDR) is studied in the high-dimensional asymptotic framework, which allows the dimensionality to exceed the sample size. Under mild assumptions on the traces of the covariance matrix, our new results provide the asymptotic distribution of the conditional misclassification error and an explicit expression for a consistent and asymptotically unbiased estimator of the expected misclassification error. To obtain these properties, new results on the asymptotic normality of the quadratic forms and traces of higher powers of the Wishart matrix are established. Using our asymptotic results, we further develop two generic methods of determining a cut-off point for EDDR to adjust the misclassification errors. Finally, we numerically justify the high accuracy of our asymptotic findings, along with the cut-off determination methods, in finite-sample applications, including large-sample and high-dimensional scenarios.
Submitted 3 March, 2014;
originally announced March 2014.
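To make the rule discussed in the abstract above concrete, the following is a minimal sketch of a Euclidean distance discriminant in its plain nearest-centroid form with an adjustable cut-off. The names `eddr_fit` and `eddr_classify` and the `cutoff` parameter are hypothetical; the paper's exact discriminant function and its two cut-off determination methods are not stated in the abstract and are not reproduced here.

```python
import numpy as np

def eddr_fit(X1, X2):
    """Return the sample mean vectors of the two training groups."""
    m1 = np.asarray(X1, dtype=float).mean(axis=0)
    m2 = np.asarray(X2, dtype=float).mean(axis=0)
    return m1, m2

def eddr_classify(x, m1, m2, cutoff=0.0):
    """Assign x to group 1 if ||x - m1||^2 - ||x - m2||^2 < cutoff, else group 2.

    cutoff = 0 gives the plain nearest-centroid rule; a nonzero value
    (hypothetical here) shifts the boundary to trade off the two
    misclassification errors.
    """
    x = np.asarray(x, dtype=float)
    diff = np.sum((x - m1) ** 2) - np.sum((x - m2) ** 2)
    return 1 if diff < cutoff else 2
```

Because the rule uses only squared Euclidean distances to the group means, it needs no covariance inverse and remains usable when the dimension exceeds the sample size.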