-
The e-Partitioning Principle of False Discovery Rate Control
Authors:
Jelle Goeman,
Rianne de Heide,
Aldo Solari
Abstract:
We present a novel necessary and sufficient principle for False Discovery Rate (FDR) control. This e-Partitioning Principle says that a procedure controls FDR if and only if it is a special case of a general e-Partitioning procedure. By writing existing methods as special cases of this procedure, we can achieve uniform improvements of these methods, and we show this in particular for the eBH, BY and Su methods. We also show that methods developed using the e-Partitioning Principle have several valuable properties. They generally control FDR not just for one rejected set, but simultaneously over many, allowing post hoc flexibility for the researcher in the final choice of the rejected hypotheses. Under some conditions, they also allow for post hoc adjustment of the error rate, choosing the FDR level $α$ post hoc, or switching to familywise error control after seeing the data. In addition, e-Partitioning allows FDR control methods to exploit logical relationships between hypotheses to gain power.
Submitted 22 April, 2025;
originally announced April 2025.
-
Bad estimation, good prediction: the Lasso in dense regimes
Authors:
Andrea Bratsberg,
Magne Thoresen,
Jelle J. Goeman
Abstract:
For high-dimensional omics data, sparsity-inducing regularization methods such as the Lasso are widely used and often yield strong predictive performance, even in settings where the assumption of sparsity is likely violated. We demonstrate that under a specific dense model, namely the high-dimensional joint latent variable model, the Lasso produces sparse prediction rules with favorable prediction error bounds, even when the underlying regression coefficient vector is not sparse at all. We further argue that this model better represents many types of omics data than sparse linear regression models. We prove that the prediction bound under this model in fact decreases with increasing number of predictors, and confirm this through simulation examples. These results highlight the need for caution when interpreting sparse prediction rules, as strong prediction accuracy of a sparse prediction rule may not imply underlying biological significance of the individual predictors.
Submitted 11 February, 2025;
originally announced February 2025.
-
Carefree multiple testing with e-processes
Authors:
Yury Tavyrikov,
Jelle J. Goeman,
Rianne de Heide
Abstract:
E-processes enable hypothesis testing with ongoing data collection while maintaining Type I error control. However, when testing multiple hypotheses simultaneously, current $e$-value based multiple testing methods such as e-BH are not invariant to the order in which data are gathered for the different $e$-processes. This can lead to undesirable situations, e.g., where a hypothesis rejected at time $t$ is no longer rejected at time $t+1$ after choosing to gather more data for one or more $e$-processes unrelated to that hypothesis. We argue that multiple testing methods should always work with suprema of $e$-processes. We provide an example to illustrate that e-BH does not control this FDR at level $α$ when applied to suprema of $e$-processes. We show that adjusters can be used to ensure FDR-sup control with e-BH under arbitrary dependence.
Submitted 31 January, 2025;
originally announced January 2025.
-
A generalized distance covariance framework for genome-wide association studies
Authors:
Dominic Edelmann,
Fernando Castro-Prado,
Jelle J. Goeman
Abstract:
When testing for the association of a single SNP with a phenotypic response, one usually considers an additive genetic model, assuming that the mean of the response for the heterozygous state is the average of the means for the two homozygous states. However, this simplification often does not hold. In this paper, we present a novel framework for testing the association of a single SNP and a phenotype. Unlike the predominant standard approach, our methodology is guaranteed to detect all dependencies expressed by classical genetic association models. The asymptotic distribution under mild regularity assumptions is derived. Moreover, the finite sample distribution under Gaussianity is provided in which the exact p-value can be efficiently evaluated via the classical Appell hypergeometric series. Both results are extended to a regression-type setting with nuisance covariates, enabling hypothesis testing in a wide range of scenarios. A connection of our approach to score tests is explored, leading to intuitive interpretations as locally most powerful tests. A simulation study demonstrates the computational efficiency and excellent statistical performance of the proposed methodology. A real data example is provided.
Submitted 4 January, 2025;
originally announced January 2025.
-
Variable selection via fused sparse-group lasso penalized multi-state models incorporating molecular data
Authors:
Kaya Miah,
Jelle J. Goeman,
Hein Putter,
Annette Kopp-Schneider,
Axel Benner
Abstract:
In multi-state models based on high-dimensional data, effective modeling strategies are required to determine an optimal, ideally parsimonious model. In particular, linking covariate effects across transitions is needed to conduct joint variable selection. A useful technique to reduce model complexity is to address homogeneous covariate effects for distinct transitions. We integrate this approach to data-driven variable selection by extended regularization methods within multi-state model building. We propose the fused sparse-group lasso (FSGL) penalized Cox-type regression in the framework of multi-state models combining the penalization concepts of pairwise differences of covariate effects along with transition grouping. For optimization, we adapt the alternating direction method of multipliers (ADMM) algorithm to transition-specific hazards regression in the multi-state setting. In a simulation study and application to acute myeloid leukemia (AML) data, we evaluate the algorithm's ability to select a sparse model incorporating relevant transition-specific effects and similar cross-transition effects. We investigate settings in which the combined penalty is beneficial compared to global lasso regularization.
Submitted 26 November, 2024;
originally announced November 2024.
-
OCEAN: Flexible Feature Set Aggregation for Analysis of Multi-omics Data
Authors:
Mitra Ebrahimpoor,
Renee Menezes,
Ningning Xu,
Jelle J. Goeman
Abstract:
Integrated analysis of multi-omics datasets holds great promise for uncovering complex biological processes. However, the large dimension of omics data poses significant interpretability and multiple testing challenges. Simultaneous Enrichment Analysis (SEA) was introduced to address these issues in single-omics analysis, providing an in-built multiple testing correction and enabling simultaneous feature set testing. In this paper, we introduce OCEAN, an extension of SEA to multi-omics data. OCEAN is a flexible approach to analyze potentially all possible two-way feature sets from any pair of genomics datasets. We also propose two new error rates which are in line with the two-way structure of the data and facilitate interpretation of the results. The power and utility of OCEAN are demonstrated by analyzing copy number and gene expression data for breast and colon cancer.
Submitted 25 October, 2024;
originally announced October 2024.
-
Permutation-based multiple testing when fitting many generalized linear models
Authors:
Riccardo De Santis,
Jelle J. Goeman,
Samuel Davenport,
Jesse Hemerik,
Livio Finos
Abstract:
In many applied sciences a popular analysis strategy for high-dimensional data is to fit many multivariate generalized linear models in parallel. This paper presents a novel approach to address the resulting multiple testing problem by combining a recently developed sign-flip test with permutation-based multiple-testing procedures. Our method builds upon the univariate standardized flip-scores test which offers robustness against misspecified variances in generalized linear models, a crucial feature in high-dimensional settings where comprehensive model validation is particularly challenging. We extend this approach to the multivariate setting, enabling adaptation to unknown response correlation structures. This approach yields relevant power improvements over conventional multiple testing methods when correlation is present.
Submitted 4 October, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
Robust Inference for Generalized Linear Mixed Models: An Approach Based on Score Sign Flipping
Authors:
Angela Andreella,
Jelle Goeman,
Jesse Hemerik,
Livio Finos
Abstract:
Despite the versatility of generalized linear mixed models in handling complex experimental designs, they often suffer from misspecification and convergence problems. This makes inference on the values of coefficients problematic. To address these challenges, we propose a robust extension of the score-based statistical test using sign-flipping transformations. Our approach efficiently handles within-variance structure and heteroscedasticity, ensuring accurate regression coefficient testing. The approach is illustrated by analyzing the reduction of health issues over time for newly adopted children. The model is characterized by a binomial response with unbalanced frequencies and several categorical and continuous predictors. The proposed approach efficiently deals with critical problems related to longitudinal nonlinear models, surpassing common statistical approaches such as generalized estimating equations and generalized linear mixed models.
Submitted 27 March, 2025; v1 submitted 31 January, 2024;
originally announced January 2024.
-
On the error control of invariant causal prediction
Authors:
Jinzhou Li,
Jelle J Goeman
Abstract:
Invariant causal prediction (ICP, Peters et al. (2016)) provides a novel way for identifying causal predictors of a response by utilizing heterogeneous data from different environments. One notable advantage of ICP is that it guarantees to make no false causal discoveries with high probability. Such a guarantee, however, can be overly conservative in some applications, resulting in few or no causal discoveries. This raises a natural question: Can we use less conservative error control guarantees for ICP so that more causal information can be extracted from data? We address this question in the paper. We focus on two commonly used and more liberal guarantees: false discovery rate control and simultaneous true discovery bound. Unexpectedly, we find that false discovery rate does not seem to be a suitable error criterion for ICP. The simultaneous true discovery bound, on the other hand, proves to be an ideal choice, enabling users to explore potential causal predictors and extract more causal information. Importantly, the additional information comes for free, in the sense that no extra assumptions are required and the discoveries from the original ICP approach are fully retained. We demonstrate the practical utility of our method through simulations and a real dataset about the educational attainment of teenagers in the US.
Submitted 8 October, 2024; v1 submitted 8 January, 2024;
originally announced January 2024.
-
Selective inference for fMRI cluster-wise analysis, issues, and recommendations for critical vector selection: A comment on Blain et al
Authors:
Angela Andreella,
Anna Vesely,
Wouter Weeda,
Jelle Goeman
Abstract:
Two permutation-based methods for simultaneous inference on the proportion of active voxels in cluster-wise brain imaging analysis have recently been published: Notip (Blain et al. 2022) and pARI (Andreella et al. 2023). Both rely on the definition of a critical vector of ordered p-values, chosen from a family of candidate vectors, but differ in how the family is defined: computed from randomization of external data for Notip and determined a priori for pARI. These procedures were compared to other proposals in the literature, but an extensive comparison between the two methods is missing due to their parallel publication. We provide such a comparison and find that pARI outperforms Notip if both methods are applied under their recommended settings. However, each method carries different advantages and drawbacks.
Submitted 6 February, 2024; v1 submitted 5 July, 2023;
originally announced July 2023.
-
Simultaneous false discovery proportion bounds via knockoffs and closed testing
Authors:
Jinzhou Li,
Marloes H. Maathuis,
Jelle J. Goeman
Abstract:
We propose new methods to obtain simultaneous false discovery proportion bounds for knockoff-based approaches. We first investigate an approach based on Janson and Su's $k$-familywise error rate control method and interpolation. We then generalize it by considering a collection of $k$ values, and show that the bound of Katsevich and Ramdas is a special case of this method and can be uniformly improved. Next, we further generalize the method by using closed testing with a multi-weighted-sum local test statistic. This allows us to obtain a further uniform improvement and other generalizations over previous methods. We also develop an efficient shortcut for its implementation. We compare the performance of our proposed methods in simulations and apply them to a data set from the UK Biobank.
Submitted 25 February, 2024; v1 submitted 24 December, 2022;
originally announced December 2022.
-
Inference in generalized linear models with robustness to misspecified variances
Authors:
Riccardo De Santis,
Jelle J. Goeman,
Jesse Hemerik,
Samuel Davenport,
Livio Finos
Abstract:
Generalized linear models usually assume a common dispersion parameter, an assumption that is seldom true in practice. Consequently, standard parametric methods may suffer appreciable loss of type I error control. As an alternative, we present a semi-parametric group-invariance method based on sign flipping of score contributions. Our method requires only the correct specification of the mean model, but is robust against any misspecification of the variance. We present tests for single as well as multiple regression coefficients. The test is asymptotically valid but shows excellent performance in small samples. We illustrate the method using RNA sequencing count data, for which it is difficult to model the overdispersion correctly. The method is available in the R library flipscores.
Submitted 13 September, 2024; v1 submitted 28 September, 2022;
originally announced September 2022.
-
Flexible control of the median of the false discovery proportion
Authors:
Jesse Hemerik,
Aldo Solari,
Jelle J Goeman
Abstract:
We introduce a multiple testing procedure that controls the median of the proportion of false discoveries (FDP) in a flexible way. The procedure only requires a vector of p-values as input and is comparable to the Benjamini-Hochberg method, which controls the mean of the FDP. Our method allows freely choosing one or several values of alpha after seeing the data -- unlike Benjamini-Hochberg, which can be very liberal when alpha is chosen post hoc. We prove these claims and illustrate them with simulations. Our procedure is inspired by a popular estimator of the total number of true hypotheses. We adapt this estimator to provide simultaneously median unbiased estimators of the FDP, valid for finite samples. This simultaneity allows for the claimed flexibility. Our approach does not assume independence. The time complexity of our method is linear in the number of hypotheses, after sorting the p-values.
Submitted 13 March, 2024; v1 submitted 24 August, 2022;
originally announced August 2022.
-
Cluster extent inference revisited: quantification and localization of brain activity
Authors:
Jelle J. Goeman,
Paweł Górecki,
Ramin Monajemi,
Xu Chen,
Thomas E. Nichols,
Wouter Weeda
Abstract:
Cluster inference based on spatial extent thresholding is the most popular analysis method for finding activated brain areas in neuroimaging. However, the method has several well-known issues. While powerful for finding brain regions with some activation, the method as currently defined does not allow any further quantification or localization of signal. In this paper we repair this gap. We show that cluster-extent inference can be used (1.) to infer the presence of signal in anatomical regions of interest and (2.) to quantify the percentage of active voxels in any cluster or region of interest. These additional inferences come for free, i.e. they do not require any further adjustment of the alpha-level of tests, while retaining full familywise error control. We achieve this extension of the possibilities of cluster inference by an embedding of the method into a closed testing procedure, and solving the graph-theoretic k-separator problem that results from this embedding. The new method can be used in combination with random field theory or permutations. We demonstrate the usefulness of the method in a large-scale application to neuroimaging data from the Neurovault database.
Submitted 9 August, 2022;
originally announced August 2022.
-
On Selecting and Conditioning in Multiple Testing and Selective Inference
Authors:
Jelle Goeman,
Aldo Solari
Abstract:
We investigate a class of methods for selective inference that condition on a selection event. Such methods follow a two-stage process. First, a data-driven (sub)collection of hypotheses is chosen from some large universe of hypotheses. Subsequently, inference takes place within this data-driven collection, conditioned on the information that was used for the selection. Examples of such methods include basic data splitting, as well as modern data carving methods and post-selection inference methods for lasso coefficients based on the polyhedral lemma. In this paper, we adopt a holistic view on such methods, considering the selection, conditioning, and final error control steps together as a single method. From this perspective, we demonstrate that multiple testing methods defined directly on the full universe of hypotheses are always at least as powerful as selective inference methods based on selection and conditioning. This result holds true even when the universe is potentially infinite and only implicitly defined, such as in the case of data splitting. We provide a comprehensive theoretical framework, along with insights, and delve into several case studies to illustrate instances where a shift to a non-selective or unconditional perspective can yield a power gain.
Submitted 5 December, 2023; v1 submitted 27 July, 2022;
originally announced July 2022.
-
Adaptive Cluster Thresholding with Spatial Activation Guarantees Using All-resolutions Inference
Authors:
Xu Chen,
Jelle J. Goeman,
Thijmen J. P. Krebs,
Rosa J. Meijer,
Wouter D. Weeda
Abstract:
Classical cluster inference is hampered by the spatial specificity paradox. Given the null-hypothesis of no active voxels, the alternative hypothesis states that there is at least one active voxel in a cluster. Hence, the larger the cluster the less we know about where activation in the cluster is. Rosenblatt et al. (2018) proposed a post-hoc inference method, All-resolutions Inference (ARI), that addresses this paradox by estimating the number of active voxels of any brain region. ARI allows users to choose arbitrary brain regions and returns a simultaneous lower confidence bound of the true discovery proportion (TDP) for each of them, retaining control of the family-wise error rate. ARI does not, however, guide users to regions with high enough TDP. In this paper, we propose an efficient algorithm that outputs all maximal supra-threshold clusters, for which ARI gives a TDP lower confidence bound that is at least a chosen threshold, for any number of thresholds that need not be chosen a priori nor all at once. After a preprocessing step in linearithmic time, the algorithm only takes linear time in the size of its output. We demonstrate the algorithm with an application to two fMRI datasets. For both datasets, we found several clusters whose TDP confidently meets or exceeds a given threshold in less than a second.
Submitted 10 May, 2023; v1 submitted 27 June, 2022;
originally announced June 2022.
-
Resampling-Based Multisplit Inference for High-Dimensional Regression
Authors:
Anna Vesely,
Jelle J. Goeman,
Livio Finos
Abstract:
We propose a novel resampling-based method to construct an asymptotically exact test for any subset of hypotheses on coefficients in high-dimensional linear regression. It can be embedded into any multiple testing procedure to make confidence statements on relevant predictor variables. The method constructs permutation test statistics for any individual hypothesis by means of repeated splits of the data and a variable selection technique; then it defines a test for any subset by suitably aggregating its variables' test statistics. The resulting procedure is extremely flexible, as it allows different selection techniques and several combining functions. We present it in two ways: an exact method and an approximate one, that requires less memory usage and shorter computation time, and can be scaled up to higher dimensions. We illustrate the performance of the method with simulations and the analysis of real gene expression data.
Submitted 25 May, 2022;
originally announced May 2022.
-
Permutation-Based True Discovery Guarantee by Sum Tests
Authors:
Anna Vesely,
Livio Finos,
Jelle J. Goeman
Abstract:
Sum-based global tests are highly popular in multiple hypothesis testing. In this paper we propose a general closed testing procedure for sum tests, which provides lower confidence bounds for the proportion of true discoveries (TDP), simultaneously over all subsets of hypotheses. These simultaneous inferences come for free, i.e., without any adjustment of the alpha-level, whenever a global test is used. Our method allows for an exploratory approach, as simultaneity ensures control of the TDP even when the subset of interest is selected post hoc. It adapts to the unknown joint distribution of the data through permutation testing. Any sum test may be employed, depending on the desired power properties. We present an iterative shortcut for the closed testing procedure, based on the branch and bound algorithm, which converges to the full closed testing results, often after few iterations; even if it is stopped early, it controls the TDP. We compare the properties of different choices for the sum test through simulations, then we illustrate the feasibility of the method for high dimensional data on brain imaging and genomics data.
Submitted 18 January, 2023; v1 submitted 23 February, 2021;
originally announced February 2021.
-
Large-scale simultaneous inference under dependence
Authors:
Jinjin Tian,
Xu Chen,
Eugene Katsevich,
Jelle Goeman,
Aaditya Ramdas
Abstract:
Simultaneous inference allows for the exploration of data while deciding on criteria for proclaiming discoveries. It was recently proved that all admissible post-hoc inference methods for true discoveries must employ closed testing. In this paper, we investigate efficient closed testing with local tests of a special form: thresholding a function of sums of test scores for the individual hypotheses. Under this special design, we propose a new statistic that quantifies the cost of multiplicity adjustments, and we develop fast (mostly linear-time) algorithms for post-hoc inference. Paired with recent advances in global null tests based on generalized means, our work instantiates a series of simultaneous inference methods that can handle many dependence structures and signal compositions. We provide guidance on the method choices via theoretical investigation of the conservativeness and sensitivity for different local tests, as well as simulations that find analogous behavior for local tests and full closed testing.
Submitted 22 March, 2022; v1 submitted 22 February, 2021;
originally announced February 2021.
-
Permutation-based true discovery proportions for functional Magnetic Resonance Imaging cluster analysis
Authors:
Angela Andreella,
Jesse Hemerik,
Wouter Weeda,
Livio Finos,
Jelle Goeman
Abstract:
We propose a permutation-based method for testing a large collection of hypotheses simultaneously. Our method provides lower bounds for the number of true discoveries in any selected subset of hypotheses. These bounds are simultaneously valid with high confidence. The methodology is particularly useful in functional Magnetic Resonance Imaging cluster analysis, where it provides a confidence statement on the percentage of truly activated voxels within clusters of voxels, avoiding the well-known spatial specificity paradox. We offer a user-friendly tool to estimate the percentage of true discoveries for each cluster while controlling the family-wise error rate for multiple testing and taking into account that the cluster was chosen in a data-driven way. The method adapts to the spatial correlation structure that characterizes functional Magnetic Resonance Imaging data, gaining power over parametric approaches.
Submitted 26 January, 2023; v1 submitted 1 December, 2020;
originally announced December 2020.
-
Comparing three groups
Authors:
Jelle Goeman,
Aldo Solari
Abstract:
We revisit simple and powerful methods for multiple pairwise comparisons that can be used in designs with three groups. We argue that the proper choice of method should be determined by assessing which of the comparisons are primary and which are secondary, based on subject-matter considerations. We review four different methods that are simple to use with any standard software, but are substantially more powerful than frequently used methods such as an ANOVA test followed by Tukey's method.
Submitted 11 May, 2020;
originally announced May 2020.
-
Maximum likelihood estimation in the additive hazards model
Authors:
Chengyuan Lu,
Jelle Goeman,
Hein Putter
Abstract:
The additive hazards model specifies the effect of covariates on the hazard in an additive way, in contrast to the popular Cox model, in which it is multiplicative. As a non-parametric model, it offers a very flexible way of modeling time-varying covariate effects. It is most commonly estimated by ordinary least squares. In this paper we consider the case where covariates are bounded, and derive the maximum likelihood estimator under the constraint that the hazard is non-negative for all covariate values in their domain. We describe an efficient algorithm to find the maximum likelihood estimator. The method is contrasted with the ordinary least squares approach in a simulation study, and the method is illustrated on a realistic data set.
Submitted 20 January, 2022; v1 submitted 13 April, 2020;
originally announced April 2020.
-
Pathway Testing in Metabolomics with Globaltest, Allowing Post Hoc Choice of Pathways
Authors:
Ningning Xu,
Aldo Solari,
Jelle Goeman
Abstract:
The Globaltest is a powerful test for the global null hypothesis that there is no association between a group of features and a response of interest, which is popular in pathway testing in metabolomics. Evaluating multiple pathways, however, requires multiple testing correction. In this paper, we propose a multiple testing method, based on closed testing, specifically designed for the Globaltest. The proposed method controls the family-wise error rate simultaneously over all possible feature sets, and therefore allows post hoc inference, i.e. the researcher may choose the pathway database after seeing the data without jeopardizing error control. To circumvent the exponential computation time of closed testing, we derive a novel shortcut that allows exact closed testing to be performed on the scale of metabolomics data. An R package ctgt is available on CRAN. We illustrate the shortcut on several metabolomics data examples.
Submitted 4 February, 2021; v1 submitted 6 January, 2020;
originally announced January 2020.
-
Another look at the Lady Tasting Tea and differences between permutation tests and randomization tests
Authors:
Jesse Hemerik,
Jelle J. Goeman
Abstract:
The statistical literature is known to be inconsistent in the use of the terms "permutation test" and "randomization test". Several authors successfully argue that these terms should be used to refer to two distinct classes of tests and that there are major conceptual differences between these classes. The present paper explains an important difference in mathematical reasoning between these classes: a permutation test fundamentally requires that the set of permutations has a group structure, in the algebraic sense; the reasoning behind a randomization test is not based on such a group structure and it is possible to use an experimental design that does not correspond to a group. In particular, we can use a randomization scheme where the number of possible treatment patterns is larger than in standard experimental designs. This leads to exact $p$-values of improved resolution, providing increased power for very small significance levels, at the cost of decreased power for larger significance levels. We discuss applications in randomized trials and elsewhere. Further, we explain that Fisher's famous Lady Tasting Tea experiment, which is commonly referred to as the first permutation test, is in fact a randomization test. This distinction is important to avoid confusion and invalid tests.
Submitted 6 October, 2020; v1 submitted 5 December, 2019;
originally announced December 2019.
-
Robust testing in generalized linear models by sign-flipping score contributions
Authors:
Jesse Hemerik,
Jelle J Goeman,
Livio Finos
Abstract:
Generalized linear models are often misspecified due to overdispersion, heteroscedasticity and ignored nuisance variables. Existing quasi-likelihood methods for testing in misspecified models often do not provide satisfactory type-I error rate control. We provide a novel semi-parametric test, based on sign-flipping individual score contributions. The tested parameter is allowed to be multi-dimensional and even high-dimensional. Our test is often robust against the mentioned forms of misspecification and provides better type-I error control than its competitors. When nuisance parameters are estimated, our basic test becomes conservative. We show how to take nuisance estimation into account to obtain an asymptotically exact test. Our proposed test is asymptotically equivalent to its parametric counterpart.
Submitted 8 May, 2020; v1 submitted 9 September, 2019;
originally announced September 2019.
-
Only Closed Testing Procedures are Admissible for Controlling False Discovery Proportions
Authors:
Jelle Goeman,
Jesse Hemerik,
Aldo Solari
Abstract:
We consider the class of all multiple testing methods controlling tail probabilities of the false discovery proportion, either for one random set or simultaneously for many such sets. This class encompasses methods controlling familywise error rate, generalized familywise error rate, false discovery exceedance, joint error rate, simultaneous control of all false discovery proportions, and others, as well as seemingly unrelated methods such as gene set testing in genomics and cluster inference methods in neuroimaging. We show that all such methods are either equivalent to a closed testing method, or are uniformly improved by one. Moreover, we show that a closed testing method is admissible as a method controlling tail probabilities of false discovery proportions if and only if all its local tests are admissible. This implies that, when designing such methods, it is sufficient to restrict attention to closed testing methods only. We demonstrate the practical usefulness of this design principle by constructing a uniform improvement of a recently proposed method.
Submitted 29 April, 2022; v1 submitted 15 January, 2019;
originally announced January 2019.
-
Simultaneous Confidence Intervals for Ranks With Application to Ranking Institutions
Authors:
Diaa Al Mohamad,
Jelle J. Goeman,
Erik W. van Zwet
Abstract:
When a ranking of institutions such as medical centers or universities is based on an indicator provided with a standard error, confidence intervals should be calculated to assess the quality of these ranks. We consider the problem of constructing simultaneous confidence intervals for the ranks of means based on an observed sample. For this aim, the only available method from the literature uses Monte-Carlo simulations and is highly anticonservative, especially when the means are close to each other or have ties. We present a novel method based on Tukey's honest significant difference test (HSD). Our new method is, on the contrary, conservative when there are no ties. By properly rescaling these two methods to the nominal confidence level, they surprisingly perform very similarly. The Monte-Carlo method is, however, unscalable when the number of institutions is larger than 30 to 50 and thus stays anticonservative. We provide extensive simulations to support our claims and the two methods are compared in terms of their simultaneous coverage and their efficiency. We provide a data analysis for 64 hospitals in the Netherlands and compare both methods. Software for our new methods is available online in package ICRanks downloadable from CRAN. Supplementary materials include supplementary R code for the simulations and proofs of the propositions presented in this paper.
Submitted 11 December, 2018;
originally announced December 2018.
-
Permutation-based simultaneous confidence bounds for the false discovery proportion
Authors:
Jesse Hemerik,
Aldo Solari,
Jelle J. Goeman
Abstract:
When multiple hypotheses are tested, interest is often in ensuring that the proportion of false discoveries (FDP) is small with high confidence. In this paper, confidence upper bounds for the FDP are constructed, which are simultaneous over all rejection cut-offs. In particular this allows the user to select a set of hypotheses post hoc such that the FDP lies below some constant with high confidence. Our method uses permutations to account for the dependence structure in the data. So far only Meinshausen provided an exact, permutation-based and computationally feasible method for simultaneous FDP bounds. We provide an exact method, which uniformly improves this procedure. Further, we provide a generalization of this method. It lets the user select the shape of the simultaneous confidence bounds. This gives the user more freedom in determining the power properties of the method. Interestingly, several existing permutation methods, such as Significance Analysis of Microarrays (SAM) and Westfall and Young's maxT method, are obtained as special cases.
Submitted 16 August, 2018;
originally announced August 2018.
-
Adaptive Critical Value for Constrained Likelihood Ratio Testing
Authors:
Diaa Al Mohamad,
Jelle J. Goeman,
Erik W. van Zwet,
Eric A. Cator
Abstract:
We present a new way of testing ordered hypotheses against all alternatives which outperforms the classical approach both in simplicity and statistical power. Our new method tests the constrained likelihood ratio statistic against the quantile of one and only one chi-squared random variable with data-dependent degrees of freedom instead of a mixture of chi-squares. Our new test is proved to have a valid finite-sample significance level $α$ and provides more power especially for sparse alternatives (those with a few or moderate number of null constraint violations) in comparison to the classical approach. Our method is also easier to use than the classical approach, which requires calculating or simulating a set of complicated weights. Two special cases are considered in more detail, namely the case of testing orthants $μ_1<0, \cdots, μ_n<0$ and the isotonic case of testing $μ_1<μ_2<μ_3$ against all alternatives. Contours of the difference in power are shown for these examples, showing the interest of our new approach.
Submitted 25 June, 2018; v1 submitted 4 June, 2018;
originally announced June 2018.
-
Gaining power in multiple testing of interval hypotheses via conditionalization
Authors:
Jules L. Ellis,
Jakub Pecanka,
Jelle Goeman
Abstract:
In this paper we introduce a novel procedure for improving multiple testing procedures (MTPs) under scenarios when the null hypothesis $p$-values tend to be stochastically larger than standard uniform (referred to as 'inflated'). An important class of problems for which this occurs are tests of interval hypotheses. The new procedure starts with a set of $p$-values and discards those with values above a certain pre-selected threshold while the rest are corrected (scaled-up) by the value of the threshold. Subsequently, a chosen family-wise error rate (FWER) or false discovery rate (FDR) MTP is applied to the set of corrected $p$-values only. We prove the general validity of this procedure under independence of $p$-values, and for the special case of the Bonferroni method we formulate several sufficient conditions for the control of the FWER. It is demonstrated that this 'filtering' of $p$-values can yield considerable gains of power under scenarios with inflated null hypotheses $p$-values.
Submitted 30 December, 2017;
originally announced January 2018.
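The filtering step described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the threshold `tau = 0.5`, and the example $p$-values are hypothetical, and Bonferroni is used as the downstream FWER procedure. The abstract notes that validity depends on conditions such as independence of the $p$-values, which this sketch does not check.

```python
def plain_bonferroni(pvalues, alpha=0.05):
    """Indices rejected by ordinary Bonferroni over all m p-values."""
    m = len(pvalues)
    return [i for i, p in enumerate(pvalues) if p <= alpha / m]


def conditionalized_bonferroni(pvalues, tau=0.5, alpha=0.05):
    """Discard p-values above tau, scale the rest up by 1/tau,
    then apply Bonferroni to the retained (corrected) p-values only.

    tau and alpha are illustrative defaults, not values from the paper.
    """
    # Keep only p-values at or below the pre-selected threshold, scaled up.
    kept = [(i, p / tau) for i, p in enumerate(pvalues) if p <= tau]
    k = len(kept)
    if k == 0:
        return []
    # Bonferroni over the k retained hypotheses instead of all m.
    return [i for i, q in kept if q <= alpha / k]


# Hypothetical p-values: many are large, as with inflated null p-values.
pvals = [0.001, 0.007, 0.3, 0.7, 0.8, 0.9, 0.95, 0.99]
print(plain_bonferroni(pvals))            # rejects only index 0
print(conditionalized_bonferroni(pvals))  # rejects indices 0 and 1
```

In this toy example five of the eight $p$-values exceed the threshold and are discarded, so the multiplicity correction is over three hypotheses rather than eight, which more than offsets the factor-of-two scaling.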
-
A shortcut for Hommel's procedure in linearithmic time
Authors:
Rosa Meijer,
Thijmen Krebs,
Aldo Solari,
Jelle Goeman
Abstract:
Hommel's and Hochberg's procedures for familywise error control are both derived as shortcuts in a closed testing procedure with the Simes local test. Hommel's shortcut is exact but takes quadratic time in the number of hypotheses. Hochberg's shortcut takes only linearithmic time, but is conservative. In this paper we present an exact shortcut in linearithmic time, combining the strengths of both procedures. The novel shortcut also applies to a robust variant of Hommel's procedure that does not require the assumption of the Simes inequality.
Submitted 23 October, 2017;
originally announced October 2017.
-
Simultaneous confidence sets for ranks using the partitioning principle - Technical report
Authors:
Diaa Al Mohamad,
Erik W. van Zwet,
Jelle J. Goeman,
Aldo Solari
Abstract:
Ranking institutions such as medical centers or universities is based on an indicator accompanied by an uncertainty measure such as a standard deviation, and confidence intervals should be calculated to assess the quality of these ranks. We consider the problem of constructing simultaneous confidence intervals for the ranks of centers based on an observed sample. We present in this paper a novel method based on multiple testing which uses the partitioning principle and employs the likelihood ratio (LR) test on the partitions. The complexity of the algorithm is super-exponential. We present several ways and shortcuts to reduce this complexity. We also provide a polynomial algorithm which produces a very good bracketing for the multiple testing by linearizing the critical value of the LR test. We show that Tukey's Honest Significant Difference (HSD) test can be written as a partitioning procedure. The new methodology has promising properties in the sense that it opens the door, in a simple and easy way, to constructing new methods which may trade exponential complexity for power of the test or vice versa. In comparison to Tukey's HSD test, the LR test seems to give better results when the centers are close to each other or the uncertainty in the data is high, which is confirmed in a simulation study.
Submitted 9 August, 2017;
originally announced August 2017.
-
An improvement of Tukey's HSD with application to ranking institutions
Authors:
Diaa Al Mohamad,
Jelle J. Goeman,
Erik W. van Zwet
Abstract:
When a ranking of institutions such as medical centers or universities is based on an indicator provided with a standard error, confidence intervals should be calculated to assess the quality of these ranks. We consider the problem of constructing simultaneous confidence intervals (CIs) for the ranks of centers based on an observed sample. We present a novel method based on Tukey's honest significant difference test (HSD) which is the first method to produce valid simultaneous CIs for ranks. Moreover, we introduce a new variant of Tukey's HSD based on the sequential rejection principle. The new algorithm ensures familywise error control, and produces simultaneous confidence intervals for the ranks uniformly shorter than those provided by Tukey's HSD for the same level of significance. We illustrate the method through both simulations and real data analysis from 64 hospitals in the Netherlands. Software for our new methods is available online in package ICRanks downloadable from CRAN. Supplementary materials include supplementary R code for the simulations and proofs of the propositions presented in this paper.
Submitted 22 November, 2018; v1 submitted 8 August, 2017;
originally announced August 2017.
-
Simultaneous Control of All False Discovery Proportions in Large-Scale Multiple Hypothesis Testing
Authors:
Jelle Goeman,
Rosa Meijer,
Thijmen Krebs,
Aldo Solari
Abstract:
Closed testing procedures are classically used for familywise error rate (FWER) control, but they can also be used to obtain simultaneous confidence bounds for the false discovery proportion (FDP) in all subsets of the hypotheses. In this paper we investigate the special case of closed testing with Simes local tests. We construct a novel fast and exact shortcut which we use to investigate the power of this method when the number of hypotheses goes to infinity. We show that, if a minimal amount of signal is present, the average power to detect false hypotheses at any desired FDP level does not vanish. Additionally, we show that the confidence bounds for FDP are consistent estimators for the true FDP for every non-vanishing subset. For the case of a finite number of hypotheses, we show connections between Simes-based closed testing and the procedure of Benjamini and Hochberg.
Submitted 23 October, 2017; v1 submitted 21 November, 2016;
originally announced November 2016.
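To make the idea of closed testing with Simes local tests concrete, the following is a brute-force sketch for a handful of hypotheses. It is not the fast shortcut the abstract above refers to: it enumerates all intersection hypotheses, so it takes exponential time, and the function names and example $p$-values are hypothetical. It relies on the standard closed-testing fact that the number of false discoveries in a selected set is bounded by the largest overlap of that set with any intersection hypothesis not rejected by its local test.

```python
from itertools import combinations


def simes_rejects(pvalues, alpha):
    """Simes local test: reject if p_(i) <= i * alpha / k for some rank i."""
    k = len(pvalues)
    return any(p <= rank * alpha / k
               for rank, p in enumerate(sorted(pvalues), start=1))


def true_discoveries_lower_bound(pvalues, selected, alpha=0.05):
    """Lower confidence bound on the number of true discoveries in `selected`,
    simultaneous over all possible selections (brute force, small m only)."""
    m = len(pvalues)
    selected = set(selected)
    max_false = 0  # the empty intersection is never rejected
    for size in range(1, m + 1):
        for subset in combinations(range(m), size):
            # An intersection hypothesis that survives its local test could
            # consist entirely of true nulls, so its overlap with the
            # selection bounds the number of false discoveries from below.
            if not simes_rejects([pvalues[i] for i in subset], alpha):
                max_false = max(max_false, len(selected & set(subset)))
    return len(selected) - max_false


# Hypothetical p-values for four hypotheses.
pvals = [0.01, 0.02, 0.03, 0.6]
print(true_discoveries_lower_bound(pvals, [0, 1, 2]))  # 2
print(true_discoveries_lower_bound(pvals, [2, 3]))     # 0
```

Because the bound holds simultaneously for every selection, the selected set may be chosen after looking at the data, and the reported bound divided by the size of the selection gives a lower bound on the true discovery proportion.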
-
Better-Than-Chance Classification for Signal Detection
Authors:
Jonathan D. Rosenblatt,
Yuval Benjamini,
Roee Gilron,
Roy Mukamel,
Jelle J. Goeman
Abstract:
The estimated accuracy of a classifier is a random quantity with variability. A common practice in supervised machine learning is thus to test if the estimated accuracy is significantly better than chance level. This method of signal detection is particularly popular in neuroimaging and genetics. We provide evidence that using a classifier's accuracy as a test statistic can be an underpowered strategy for finding differences between populations, compared to a bona-fide statistical test. It is also computationally more demanding than a statistical test. Via simulation, we compare test statistics that are based on classification accuracy to others based on multivariate test statistics. We find that the probability of detecting differences between two distributions is lower for accuracy-based statistics. We examine several candidate causes for the low power of accuracy tests. These causes include: the discrete nature of the accuracy test statistic, the type of signal accuracy tests are designed to detect, their inefficient use of the data, and their regularization. When the purpose of the analysis is not signal detection, but rather the evaluation of a particular classifier, we suggest several improvements to increase power, in particular replacing V-fold cross validation with the Leave-One-Out Bootstrap.
Submitted 14 December, 2017; v1 submitted 31 August, 2016;
originally announced August 2016.
-
Analysing multiple types of molecular profiles simultaneously: connecting the needles in the haystack
Authors:
Renée Menezes,
Leila Mohammadi,
Jelle Goeman,
Judith Boer
Abstract:
It has been shown that a random-effects framework can be used to test the association between a gene's expression level and the number of DNA copies of a set of genes. This gene-set modelling framework was later applied to find associations between mRNA expression and microRNA expression, by defining the gene sets using target prediction information.
Here, we extend the model introduced by Menezes et al. (2009) to consider the effect of not just copy number, but also of other molecular profiles such as methylation changes and loss-of-heterozygosity (LOH), on gene expression levels. We will consider again sets of measurements, to improve robustness of results and increase the power to find associations. Our approach can be used genome-wide to find associations, yields a test to help separate true associations from noise and can include confounders.
We apply our method to colon and to breast cancer samples, for which genome-wide copy number, methylation and gene expression profiles are available. Our findings include interesting gene expression-regulating mechanisms, which may involve only one of copy number or methylation, or both for the same samples. We are even able to find effects due to different molecular mechanisms in different samples.
Our method can equally well be applied to cases where other types of molecular (high-dimensional) data are collected, such as LOH, SNP genotype and microRNA expression data. Computationally efficient, it represents a flexible and powerful tool to study associations between high-dimensional datasets. The method is freely available via the SIM BioConductor package.
Submitted 8 October, 2015;
originally announced October 2015.
-
Exact testing with random permutations
Authors:
Jesse Hemerik,
Jelle Goeman
Abstract:
When permutation methods are used in practice, often a limited number of random permutations are used to decrease the computational burden. However, most theoretical literature assumes that the whole permutation group is used, and methods based on random permutations tend to be seen as approximate. There exists a very limited amount of literature on exact testing with random permutations and only recently a thorough proof of exactness was given. In this paper we provide an alternative proof, viewing the test as a "conditional Monte Carlo test" as it has been called in the literature. We also provide extensions of the result. Importantly, our results can be used to prove properties of various multiple testing procedures based on random permutations.
Submitted 17 August, 2018; v1 submitted 27 November, 2014;
originally announced November 2014.
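As an illustration of testing with a limited number of random permutations, here is a minimal sketch of a two-sample permutation test. It is not taken from the paper: the function name, the difference-in-means statistic, the number of permutations, and the data are all assumptions made for the example. The one detail it is meant to highlight is the common convention of counting the observed (identity) arrangement alongside the random draws, so that the $p$-value can never be zero.

```python
import random


def random_permutation_pvalue(x, y, n_perm=999, seed=0):
    """One-sided p-value for a difference in means, using n_perm random
    permutations plus the observed arrangement (the identity permutation)."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)

    def stat(values):
        # Mean of the first n_x values minus mean of the remaining values.
        return sum(values[:n_x]) / n_x - sum(values[n_x:]) / (len(values) - n_x)

    observed = stat(pooled)
    count = 1  # the identity permutation always counts
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        if stat(shuffled) >= observed:
            count += 1
    return count / (n_perm + 1)


# Hypothetical, clearly separated samples.
group_a = [10, 11, 12, 13, 14, 15, 16, 17]
group_b = [0, 1, 2, 3, 4, 5, 6, 7]
print(random_permutation_pvalue(group_a, group_b))
```

With 999 random permutations the smallest attainable $p$-value is 1/1000, which is why the count starts at one rather than zero.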
-
The sequential rejection principle of familywise error control
Authors:
Jelle J. Goeman,
Aldo Solari
Abstract:
Closed testing and partitioning are recognized as fundamental principles of familywise error control. In this paper, we argue that sequential rejection can be considered equally fundamental as a general principle of multiple testing. We present a general sequentially rejective multiple testing procedure and show that many well-known familywise error controlling methods can be constructed as special cases of this procedure, among which are the procedures of Holm, Shaffer and Hochberg, parallel and serial gatekeeping procedures, modern procedures for multiple testing in graphs, resampling-based multiple testing procedures and even the closed testing and partitioning procedures themselves. We also give a general proof that sequentially rejective multiple testing procedures strongly control the familywise error if they fulfill simple criteria of monotonicity of the critical values and a limited form of weak familywise error control in each single step. The sequential rejection principle gives a novel theoretical perspective on many well-known multiple testing procedures, emphasizing the sequential aspect. Its main practical usefulness is for the development of multiple testing procedures for null hypotheses, possibly logically related, that are structured in a graph. We illustrate this by presenting a uniform improvement of a recently published procedure.
Submitted 14 November, 2012;
originally announced November 2012.
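The abstract above lists Holm's procedure among the special cases of sequential rejection. A minimal sketch of Holm written in that sequential style follows; the function name and the example $p$-values are hypothetical, and the code illustrates only this one special case, not the general procedure or its conditions.

```python
def holm_sequential_rejection(pvalues, alpha=0.05):
    """Holm's procedure as a sequentially rejective loop: at each step,
    reject every remaining hypothesis whose p-value is at most
    alpha divided by the number of hypotheses not yet rejected."""
    rejected = set()
    while len(rejected) < len(pvalues):
        remaining = [i for i in range(len(pvalues)) if i not in rejected]
        # The critical value relaxes as more hypotheses are rejected.
        critical_value = alpha / len(remaining)
        new = {i for i in remaining if pvalues[i] <= critical_value}
        if not new:
            break  # no further rejections: the procedure stops
        rejected |= new
    return sorted(rejected)


# Hypothetical p-values.
print(holm_sequential_rejection([0.01, 0.04, 0.03, 0.005]))  # [0, 3]
```

In the example the first step uses the critical value 0.05/4 and rejects two hypotheses; the second step uses 0.05/2, rejects nothing more, and the loop ends.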
-
Rejoinder to "Multiple Testing for Exploratory Research"
Authors:
Jelle J. Goeman,
Aldo Solari
Abstract:
Rejoinder to "Multiple Testing for Exploratory Research" by J. J. Goeman, A. Solari [arXiv:1208.2841].
Submitted 16 August, 2012;
originally announced August 2012.
-
Multiple Testing for Exploratory Research
Authors:
Jelle J. Goeman,
Aldo Solari
Abstract:
Motivated by the practice of exploratory research, we formulate an approach to multiple testing that reverses the conventional roles of the user and the multiple testing procedure. Traditionally, the user chooses the error criterion, and the procedure the resulting rejected set. Instead, we propose to let the user choose the rejected set freely, and to let the multiple testing procedure return a confidence statement on the number of false rejections incurred. In our approach, such confidence statements are simultaneous for all choices of the rejected set, so that post hoc selection of the rejected set does not compromise their validity. The proposed reversal of roles requires nothing more than a review of the familiar closed testing procedure, but with a focus on the non-consonant rejections that this procedure makes. We suggest several shortcuts to avoid the computational problems associated with closed testing.
Submitted 2 October, 2013; v1 submitted 14 August, 2012;
originally announced August 2012.