-
Nonparametric covariate-adjusted regression
Authors:
Aurore Delaigle,
Peter Hall,
Wen-Xin Zhou
Abstract:
We consider nonparametric estimation of a regression curve when the data are observed with multiplicative distortion which depends on an observed confounding variable. We suggest several estimators, ranging from a relatively simple one that relies on restrictive assumptions usually made in the literature, to a sophisticated piecewise approach that involves reconstructing a smooth curve from an est…
▽ More
We consider nonparametric estimation of a regression curve when the data are observed with multiplicative distortion which depends on an observed confounding variable. We suggest several estimators, ranging from a relatively simple one that relies on restrictive assumptions usually made in the literature, to a sophisticated piecewise approach that involves reconstructing a smooth curve from an estimator of a constant multiple of its absolute value, and which can be applied in much more general scenarios. We show that, although our nonparametric estimators are constructed from predictors of the unobserved undistorted data, they have the same first order asymptotic properties as the standard estimators that could be computed if the undistorted data were available. We illustrate the good numerical performance of our methods on both simulated and real datasets.
△ Less
Submitted 12 January, 2016;
originally announced January 2016.
-
Kernel methods and minimum contrast estimators for empirical deconvolution
Authors:
Aurore Delaigle,
Peter Hall
Abstract:
We survey classical kernel methods for providing nonparametric solutions to problems involving measurement error. In particular we outline kernel-based methodology in this setting, and discuss its basic properties. Then we point to close connections that exist between kernel methods and much newer approaches based on minimum contrast techniques. The connections are through use of the sinc kernel…
▽ More
We survey classical kernel methods for providing nonparametric solutions to problems involving measurement error. In particular we outline kernel-based methodology in this setting, and discuss its basic properties. Then we point to close connections that exist between kernel methods and much newer approaches based on minimum contrast techniques. The connections are through use of the sinc kernel for kernel-based inference. This `infinite order' kernel is not often used explicitly for kernel-based deconvolution, although it has received attention in more conventional problems where measurement error is not an issue. We show that in a comparison between kernel methods for density deconvolution, and their counterparts based on minimum contrast, the two approaches give identical results on a grid which becomes increasingly fine as the bandwidth decreases. In consequence, the main numerical differences between these two techniques are arguably the result of different approaches to choosing smoothing parameters.
△ Less
Submitted 1 March, 2010;
originally announced March 2010.
-
Robustness and accuracy of methods for high dimensional data analysis based on Student's t statistic
Authors:
Aurore Delaigle,
Peter Hall,
Jiashun Jin
Abstract:
Student's $t$ statistic is finding applications today that were never envisaged when it was introduced more than a century ago. Many of these applications rely on properties, for example robustness against heavy tailed sampling distributions, that were not explicitly considered until relatively recently. In this paper we explore these features of the $t$ statistic in the context of its applicati…
▽ More
Student's $t$ statistic is finding applications today that were never envisaged when it was introduced more than a century ago. Many of these applications rely on properties, for example robustness against heavy tailed sampling distributions, that were not explicitly considered until relatively recently. In this paper we explore these features of the $t$ statistic in the context of its application to very high dimensional problems, including feature selection and ranking, highly multiple hypothesis testing, and sparse, high dimensional signal detection. Robustness properties of the $t$-ratio are highlighted, and it is established that those properties are preserved under applications of the bootstrap. In particular, bootstrap methods correct for skewness, and therefore lead to second-order accuracy, even in the extreme tails. Indeed, it is shown that the bootstrap, and also the more popular but less accurate $t$-distribution and normal approximations, are more effective in the tails than towards the middle of the distribution. These properties motivate new methods, for example bootstrap-based techniques for signal detection, that confine attention to the significant tail of a statistic.
△ Less
Submitted 21 January, 2010;
originally announced January 2010.
-
Weighted least squares methods for prediction in the functional data linear model
Authors:
Aurore Delaigle,
Peter Hall,
Tatiyana V. Apanasovich
Abstract:
The problem of prediction in functional linear regression is conventionally addressed by reducing dimension via the standard principal component basis. In this paper we show that an alternative basis chosen through weighted least-squares, or weighted least-squares itself, can be more effective when the experimental errors are heteroscedastic. We give a concise theoretical result which demonstrat…
▽ More
The problem of prediction in functional linear regression is conventionally addressed by reducing dimension via the standard principal component basis. In this paper we show that an alternative basis chosen through weighted least-squares, or weighted least-squares itself, can be more effective when the experimental errors are heteroscedastic. We give a concise theoretical result which demonstrates the effectiveness of this approach, even when the model for the variance is inaccurate, and we explore the numerical properties of the method. We show too that the advantages of the suggested adaptive techniques are not found only in low-dimensional aspects of the problem; rather, they accrue almost equally among all dimensions.
△ Less
Submitted 19 February, 2009;
originally announced February 2009.