Search | arXiv e-print repository

Block Designs that Provide Optimal Power in the Cochran-Mantel-Haenszel Test

Authors: David Azriel, Adam Kapelner, Abba M. Krieger

Abstract: We consider the asymptotic power performance under local alternatives of the Cochran-Mantel-Haenszel test. Our setting is non-traditional: we investigate randomized experiments that assign subjects via Fisher's blocking design. We show that blocking designs that satisfy a certain balance condition are asymptotically optimal. When the potential outcomes can be ordered, the balance condition is met for all blocking designs with number of blocks going to infinity. More generally, we prove that the pairwise matching design of Greevy et al. (2004) satisfies the balance condition under mild assumptions. In smaller sample sizes, we show a second order effect becomes operational thereby making blocking designs with a smaller number optimal. In practical settings with many covariates, we recommend pairwise matching for its ability to approximate the balance condition.

Submitted 10 July, 2025; originally announced July 2025.

Comments: 21 pages, 4 figures

arXiv:2402.07247 [pdf, other]

The Pairwise Matching Design is Optimal under Extreme Noise and Assignments

Authors: David Azriel, Abba M. Krieger, Adam Kapelner

Abstract: We consider the general performance of the difference-in-means estimator in an equally-allocated two-arm randomized experiment under common experimental endpoints such as continuous (regression), incidence, proportion, count and uncensored survival. We consider two sources of randomness: the subject-specific assignments and the contribution of unobserved subject-specific measurements. We then examine mean squared error (MSE) performance under a new, more realistic "simultaneous tail criterion". We prove that the pairwise matching design of Greevy et al. (2004) performs best asymptotically under this criterion when compared to other blocking designs. We also prove that the optimal design must be less random than complete randomization and more random than any deterministic, optimized allocation. Theoretical results are supported by simulations in all five response types.

Submitted 6 November, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

Comments: 17 pages, 2 figures, 2 tables, 7pp supplement. arXiv admin note: text overlap with arXiv:2212.01887

arXiv:2212.01887 [pdf, ps, other]

The Optimality of Blocking Designs in Equally and Unequally Allocated Randomized Experiments with General Response

Authors: David Azriel, Abba M. Krieger, Adam Kapelner

Abstract: We consider the performance of the difference-in-means estimator in a two-arm randomized experiment under common experimental endpoints such as continuous (regression), incidence, proportion and survival. We examine performance under both equal and unequal allocation to treatment groups and we consider both the Neyman randomization model and the population model. We show that in the Neyman model, where the only source of randomness is the treatment manipulation, there is no free lunch: complete randomization is minimax for the estimator's mean squared error. In the population model, where each subject experiences response noise with zero mean, the optimal design is the deterministic perfect-balance allocation. However, this allocation is generally NP-hard to compute and moreover, depends on unknown response parameters. When considering the tail criterion of Kapelner et al. (2021), we show the optimal design is less random than complete randomization and more random than the deterministic perfect-balance allocation. We prove that Fisher's blocking design provides the asymptotically optimal degree of experimental randomness. Theoretical results are supported by simulations in all considered experimental settings.

Submitted 6 July, 2025; v1 submitted 4 December, 2022; originally announced December 2022.

Comments: 39 pages, 2 figures, 2 tables

arXiv:2209.00490 [pdf, other]

The Role of Pairwise Matching in Experimental Design for an Incidence Outcome

Authors: Adam Kapelner, Abba M. Krieger, David Azriel

Abstract: We consider the problem of evaluating designs for a two-arm randomized experiment with an incidence (binary) outcome under a nonparametric general response model. Our two main results are that the a priori pair matching design of Greevy et al. (2004) is (1) the optimal design as measured by mean squared error among all block designs, which include complete randomization, and (2) minimax, i.e. it provides the lowest mean squared error under an adversarial response model. Theoretical results are supported by simulations and clinical trial data.

Submitted 1 September, 2022; originally announced September 2022.

Comments: 23 pages, 2 figures

arXiv:2012.03330 [pdf, other]

Better Experimental Design by Hybridizing Binary Matching with Imbalance Optimization

Authors: Abba M. Krieger, David Azriel, Adam Kapelner

Abstract: We present a new experimental design procedure that divides a set of experimental units into two groups in order to minimize error in estimating an additive treatment effect. One concern in minimizing error at the experimental design stage is large covariate imbalance between the two groups. Another concern is robustness of design to misspecification in response models. We address both concerns in our proposed design: we first place subjects into pairs using optimal nonbipartite matching, making our estimator robust to complicated non-linear response models. Our innovation is to keep the matched pairs extant, take differences of the covariate values within each matched pair and then use the greedy switching heuristic of Krieger et al. (2019) or rerandomization on these differences. This latter step greatly reduces covariate imbalance to the rate $O_p(n^{-4})$ in the case of one covariate that is uniformly distributed. This rate benefits from the greedy switching heuristic which is $O_p(n^{-3})$ and the rate of matching which is $O_p(n^{-1})$. Further, our resultant designs are shown to be as random as matching which is robust to unobserved covariates. When compared to previous designs, our approach exhibits significant improvement in the mean squared error of the treatment effect estimator when the response model is nonlinear and performs at least as well when the response model is linear. Our design procedure is found as a method in the open source R package available on CRAN called GreedyExperimentalDesign.

Submitted 1 February, 2021; v1 submitted 6 December, 2020; originally announced December 2020.

Comments: 18 pages, 2 tables, 2 figures

arXiv:2008.05980 [pdf, other]

Improving the Power of the Randomization Test

Authors: Abba M. Krieger, David Azriel, Michael Sklar, Adam Kapelner

Abstract: We consider the problem of evaluating designs for a two-arm randomized experiment with the criterion being the power of the randomization test for the one-sided null hypothesis. Our evaluation assumes a response that is linear in one observed covariate, an unobserved component and an additive treatment effect where the only randomness comes from the treatment allocations. It is well-known that the power depends on the allocations' imbalance in the observed covariate and this is the reason for the classic restricted designs such as rerandomization. We show that power is also affected by two other design choices: the number of allocations in the design and the degree of linear dependence among the allocations. We prove that the more allocations, the higher the power and the lower the variability in the power. Designs that feature greater independence of allocations are also shown to have higher performance. Our theoretical findings and extensive simulation studies imply that the designs with the highest power provide thousands of highly independent allocations that each provide nominal imbalance in the observed covariates. These high powered designs exhibit less randomization than complete randomization and more randomization than recently proposed designs based on numerical optimization. Model choices for a practicing experimenter are rerandomization and greedy pair switching, where both outperform complete randomization and numerical optimization. The tradeoff we find also provides a means to specify the imbalance threshold parameter when rerandomizing.

Submitted 13 August, 2020; originally announced August 2020.

Comments: 31 pages, 6 figures

arXiv:1905.03337 [pdf, other]

Optimal Rerandomization via a Criterion that Provides Insurance Against Failed Experiments

Authors: Adam Kapelner, Abba M. Krieger, Michael Sklar, David Azriel

Abstract: We present an optimized rerandomization design procedure for a non-sequential treatment-control experiment. Randomized experiments are the gold standard for finding causal effects in nature. But sometimes random assignments result in unequal partitions of the treatment and control group visibly seen as imbalance in observed covariates. There can additionally be imbalance on unobserved covariates. Imbalance in either observed or unobserved covariates increases treatment effect estimator error inflating the width of confidence regions and reducing experimental power. "Rerandomization" is a strategy that omits poor imbalance assignments by limiting imbalance in the observed covariates to a prespecified threshold. However, limiting this threshold too much can increase the risk of contracting error from unobserved covariates. We introduce a criterion that combines observed imbalance while factoring in the risk of inadvertently imbalancing unobserved covariates. We then use this criterion to locate the optimal rerandomization threshold based on the practitioner's level of desired insurance against high estimator error. We demonstrate the gains of our designs in simulation and in a dataset from a large randomized experiment in education. We provide an open source R package available on CRAN named OptimalRerandExpDesigns which generates designs according to our algorithm.

Submitted 25 January, 2021; v1 submitted 8 May, 2019; originally announced May 2019.

Comments: 27 pages, 5 figures, 2 tables, 2 algorithms

arXiv:1810.08389 [pdf, other]

Harmonizing Fully Optimal Designs with Classic Randomization in Fixed Trial Experiments

Authors: Adam Kapelner, Abba M. Krieger, Uri Shalit, David Azriel

Abstract: There is a movement in design of experiments away from the classic randomization put forward by Fisher, Cochran and others to one based on optimization. In fixed-sample trials comparing two groups, measurements of subjects are known in advance and subjects can be divided optimally into two groups based on a criterion of homogeneity or "imbalance" between the two groups. These designs are far from random. This paper seeks to understand the benefits and the costs over classic randomization in the context of different performance criteria such as Efron's worst-case analysis. In the criterion that we motivate, randomization beats optimization. However, the optimal design is shown to lie between these two extremes. Much-needed further work will provide a procedure to find these optimal designs in different scenarios in practice. Until then, it is best to randomize.

Submitted 19 October, 2018; originally announced October 2018.

Comments: 19 pages, 4 figures, 2 tables

Showing 1–8 of 8 results for author: Krieger, A M