-
Heteroscedasticity-Adjusted Ranking and Thresholding for Large-Scale Multiple Testing
Authors:
Luella Fu,
Bowen Gang,
Gareth M. James,
Wenguang Sun
Abstract:
Standardization has been a widely adopted practice in multiple testing, for it takes into account the variability in sampling and makes the test statistics comparable across different study units. However, despite conventional wisdom to the contrary, we show that there can be a significant loss in information from basing hypothesis tests on standardized statistics rather than the full data. We develop a new class of heteroscedasticity-adjusted ranking and thresholding (HART) rules that aim to improve existing methods by simultaneously exploiting commonalities and adjusting heterogeneities among the study units. The main idea of HART is to bypass standardization by directly incorporating both the summary statistic and its variance into the testing procedure. A key message is that the variance structure of the alternative distribution, which is subsumed under standardized statistics, is highly informative and can be exploited to achieve higher power. The proposed HART procedure is shown to be asymptotically valid and optimal for false discovery rate (FDR) control. Our simulation results demonstrate that HART achieves substantial power gain over existing methods at the same FDR level. We illustrate the implementation through a microarray analysis of myeloma.
Submitted 5 March, 2020; v1 submitted 17 October, 2019;
originally announced October 2019.
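The idea of ranking on the pair (statistic, standard deviation) rather than on the standardized statistic can be illustrated with a minimal sketch. It assumes a toy two-group model with known parameters: a point-mass alternative mean `mu_alt` and a non-null proportion `pi1`. Neither is given in the abstract, and this is not the authors' data-driven procedure. The function names are hypothetical.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def conditional_lfdr(x, sigma, pi1, mu_alt):
    """Posterior null probability given BOTH x and sigma (toy model).

    Assumed model: x ~ N(0, sigma^2) under the null and
    x ~ N(mu_alt, sigma^2) under the alternative, with non-null proportion pi1.
    """
    null = (1.0 - pi1) * normal_pdf(x, 0.0, sigma)
    alt = pi1 * normal_pdf(x, mu_alt, sigma)
    return null / (null + alt)


def step_up_reject(scores, alpha):
    """Reject the k smallest scores, with k the largest value whose
    running average of sorted scores stays at or below alpha."""
    order = np.argsort(scores)
    running_mean = np.cumsum(scores[order]) / np.arange(1, len(scores) + 1)
    passing = np.nonzero(running_mean <= alpha)[0]
    reject = np.zeros(len(scores), dtype=bool)
    if passing.size > 0:
        reject[order[: passing[-1] + 1]] = True
    return reject


# Two units with the same z-score (x / sigma = 2) but different sigma.
x = np.array([2.0, 4.0])
sigma = np.array([1.0, 2.0])
scores = conditional_lfdr(x, sigma, pi1=0.1, mu_alt=2.0)
```

In this toy setting the two units share a z-score yet receive different scores, because the distance between null and alternative means is measured in units of `sigma`. A rule based only on the z-score cannot tell them apart.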
-
Functional additive regression
Authors:
Yingying Fan,
Gareth M. James,
Peter Radchenko
Abstract:
We suggest a new method, called Functional Additive Regression, or FAR, for efficiently performing high-dimensional functional regression. FAR extends the usual linear regression model involving a functional predictor, $X(t)$, and a scalar response, $Y$, in two key respects. First, FAR uses a penalized least squares optimization approach to efficiently deal with high-dimensional problems involving a large number of functional predictors. Second, FAR extends beyond the standard linear regression setting to fit general nonlinear additive models. We demonstrate that FAR can be implemented with a wide range of penalty functions using a highly efficient coordinate descent algorithm. Theoretical results are developed which provide motivation for the FAR optimization criterion. Finally, we show through simulations and two real data sets that FAR can significantly outperform competing methods.
Submitted 14 October, 2015;
originally announced October 2015.
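For readers unfamiliar with coordinate descent on a penalized least squares criterion, the following is a minimal sketch for the ordinary lasso with scalar predictors. It is not the FAR algorithm: the objective `(1/2n)||y - X b||^2 + lam * ||b||_1`, the function names, and the fixed number of sweeps are all assumptions chosen for illustration.

```python
import numpy as np


def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 one coordinate at a time."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n        # per-column curvature x_j'x_j / n
    residual = y.astype(float).copy()        # equals y - X @ beta throughout
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                     # an all-zero column carries no signal
            residual += X[:, j] * beta[j]    # remove coordinate j's contribution
            rho = X[:, j] @ residual / n     # correlation with the partial residual
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            residual -= X[:, j] * beta[j]    # restore with the updated coefficient
    return beta
```

Each inner step solves the one-dimensional problem exactly, which is why small coefficients are set to exactly zero rather than merely shrunk.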
-
Functional linear regression that's interpretable
Authors:
Gareth M. James,
Jing Wang,
Ji Zhu
Abstract:
Regression models to relate a scalar $Y$ to a functional predictor $X(t)$ are becoming increasingly common. Work in this area has concentrated on estimating a coefficient function, $β(t)$, with $Y$ related to $X(t)$ through $\int β(t)X(t)\,dt$. Regions where $β(t)\ne0$ correspond to places where there is a relationship between $X(t)$ and $Y$. Alternatively, points where $β(t)=0$ indicate no relationship. Hence, for interpretation purposes, it is desirable for a regression procedure to be capable of producing estimates of $β(t)$ that are exactly zero over regions with no apparent relationship and have simple structures over the remaining regions. Unfortunately, most fitting procedures result in an estimate for $β(t)$ that is rarely exactly zero and has unnatural wiggles, making the curve hard to interpret. In this article we introduce a new approach which uses variable selection ideas, applied to various derivatives of $β(t)$, to produce estimates that are interpretable, flexible and accurate. We call our method "Functional Linear Regression That's Interpretable" (FLiRTI) and demonstrate it on simulated and real-world data sets. In addition, non-asymptotic theoretical bounds on the estimation error are presented. The bounds provide strong theoretical motivation for our approach.
Submitted 20 August, 2009;
originally announced August 2009.
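The model described in the abstract can be written out as a short worked sketch. The intercept $β_0$, the error term $ε$, the domain $[0,1]$, and the equally spaced grid $t_1,\dots,t_p$ are notational assumptions not stated in the abstract; the second line is only a Riemann-sum approximation, not necessarily the discretization the authors use.

```latex
\begin{align*}
Y &= \beta_0 + \int_0^1 \beta(t)\,X(t)\,dt + \varepsilon, \\
\int_0^1 \beta(t)\,X(t)\,dt &\approx \frac{1}{p}\sum_{j=1}^{p} \beta(t_j)\,X(t_j),
\qquad t_j = \frac{j}{p}.
\end{align*}
```

Under this approximation the values $β(t_1),\dots,β(t_p)$ play the role of ordinary regression coefficients. Sparsity in $β$ itself then gives regions where the estimate is exactly zero, while sparsity in a higher derivative gives simple shapes: for example, a second derivative that is zero almost everywhere corresponds to a piecewise linear $β(t)$.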