-
A burn-in(g) question: How long should an initial equal randomization stage be before Bayesian response-adaptive randomization?
Authors:
Edwin Y. N. Tang,
Stef Baas,
Daniel Kaddaj,
Lukas Pin,
David S. Robertson,
Sofía S. Villar
Abstract:
Response-adaptive (RA) trials offer the potential to enhance participant benefit but also complicate valid statistical analysis and potentially lead to a higher proportion of participants receiving an inferior treatment. A common approach to mitigate these disadvantages is to introduce a fixed non-adaptive randomization stage at the start of the RA design, known as the burn-in period. Currently, investigations and guidance on the effect of the burn-in length are scarce. To this end, this paper provides an exact evaluation approach to investigate how the burn-in length impacts the statistical properties of two-arm binary RA designs. We show that (1) for commonly used calibration and asymptotic tests an increase in the burn-in length reduces type I error rate inflation but does not lead to strict type I error rate control, necessitating exact tests; (2) the burn-in length substantially influences the power and participant benefit, and these measures are often not maximized at the maximum or minimum possible burn-in length; (3) the conditional exact test conditioning on total successes provides the highest average and minimum power for both small and moderate burn-in lengths compared to other tests. Using our exact analysis method, we re-design the ARREST trial to improve its statistical properties.
Submitted 25 March, 2025;
originally announced March 2025.
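To make the role of the burn-in concrete, the sketch below simulates a single two-arm binary trial. It is an illustration only and is not the exact evaluation approach of the paper: it assumes uniform Beta(1, 1) priors, counts the burn-in as a total number of participants randomized by a fair coin, and uses Thompson sampling as the Bayesian response-adaptive rule afterwards. The function name `simulate_rar_trial` and all parameter names are hypothetical.

```python
import random


def simulate_rar_trial(n, burn_in, p_a, p_b, seed=0):
    """Simulate one two-arm binary trial with an equal-randomization burn-in.

    Assumptions (not taken from the paper): Beta(1, 1) priors on both arms,
    fair-coin allocation for the first `burn_in` participants, and
    Thompson-sampling allocation for the remaining participants.
    """
    rng = random.Random(seed)
    succ = {"A": 0, "B": 0}
    fail = {"A": 0, "B": 0}
    true_p = {"A": p_a, "B": p_b}
    for i in range(n):
        if i < burn_in:
            # Burn-in stage: equal randomization, ignoring accumulated data.
            arm = "A" if rng.random() < 0.5 else "B"
        else:
            # Adaptive stage: draw once from each Beta posterior and assign
            # the participant to the arm with the larger draw.
            draw_a = rng.betavariate(1 + succ["A"], 1 + fail["A"])
            draw_b = rng.betavariate(1 + succ["B"], 1 + fail["B"])
            arm = "A" if draw_a > draw_b else "B"
        # Observe a binary response from the assigned arm.
        if rng.random() < true_p[arm]:
            succ[arm] += 1
        else:
            fail[arm] += 1
    return succ, fail


if __name__ == "__main__":
    for burn_in in (0, 20, 60):
        succ, fail = simulate_rar_trial(60, burn_in, p_a=0.3, p_b=0.6, seed=1)
        n_b = succ["B"] + fail["B"]
        print(f"burn-in {burn_in}: {n_b} of 60 allocated to arm B")
```

Repeating the simulation over many seeds and burn-in lengths gives a simulation-based view of the trade-off between power and participant benefit, whereas the paper evaluates these properties exactly.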
-
Matrix Concentration for Random Signed Graphs and Community Recovery in the Signed Stochastic Block Model
Authors:
Sawyer Jack Robertson
Abstract:
We consider graphs where edges and their signs are added independently at random from among all pairs of nodes. We establish strong concentration inequalities for adjacency and Laplacian matrices obtained from this family of random graph models. Then, we apply our results to study graphs sampled from the signed stochastic block model. Namely, we take a two-community setting where edges within the communities have positive signs and edges between the communities have negative signs and apply a random sign perturbation with probability $0< s <1/2$. In this setting, our findings include: first, the spectral gap of the corresponding signed Laplacian matrix concentrates near $2s$ with high probability; and second, the sign of the first eigenvector of the Laplacian matrix defines a weakly consistent estimator for the balanced community detection problem, or equivalently, the $\pm 1$ synchronization problem. We supplement our theoretical contributions with experimental data obtained from the models under consideration.
Submitted 29 December, 2024;
originally announced December 2024.
-
Thompson, Ulam, or Gauss? Multi-criteria recommendations for posterior probability computation methods in Bayesian response-adaptive trials
Authors:
Daniel Kaddaj,
Lukas Pin,
Stef Baas,
Edwin Y. N. Tang,
David S. Robertson,
Sofía S. Villar
Abstract:
To implement a Bayesian response-adaptive trial it is necessary to evaluate a sequence of posterior probabilities. This sequence is often approximated by simulation due to the unavailability of closed-form formulae to compute it exactly. Approximating these probabilities by simulation can be computationally expensive and impact the accuracy or the range of scenarios that may be explored. An alternative approximation method based on Gaussian distributions can be faster but its accuracy is not guaranteed. The literature lacks practical recommendations for selecting approximation methods and comparing their properties, particularly considering trade-offs between computational speed and accuracy. In this paper, we focus on the case where the trial has a binary endpoint with Beta priors. We first outline an efficient way to compute the posterior probabilities exactly for any number of treatment arms. Then, using exact probability computations, we show how to benchmark calculation methods based on considerations of computational speed, patient benefit, and inferential accuracy. This is done through a range of simulations in the two-armed case, as well as an analysis of the three-armed Established Status Epilepticus Treatment Trial. Finally, we provide practical guidance for which calculation method is most appropriate in different settings, and how to choose the number of simulations if the simulation-based approximation method is used.
Submitted 29 November, 2024;
originally announced November 2024.
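For the two-armed binary case with Beta posteriors, the contrast between exact and simulation-based posterior probabilities can be sketched as follows. This is not the computation method proposed in the paper, which covers any number of arms; it uses a well-known closed-form sum for P(p_B > p_A) that is valid when the first shape parameter of arm B is a positive integer. The function names are hypothetical.

```python
import math
import random


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def prob_b_beats_a(alpha_a, beta_a, alpha_b, beta_b):
    """Exact P(p_B > p_A) for independent Beta(alpha, beta) posteriors.

    Assumes alpha_b is a positive integer, so the probability reduces to a
    finite sum of ratios of Beta functions.
    """
    total = 0.0
    for i in range(alpha_b):
        log_term = (
            log_beta(alpha_a + i, beta_a + beta_b)
            - math.log(beta_b + i)
            - log_beta(1 + i, beta_b)
            - log_beta(alpha_a, beta_a)
        )
        total += math.exp(log_term)
    return total


def prob_b_beats_a_mc(alpha_a, beta_a, alpha_b, beta_b, n_draws=100_000, seed=0):
    """Simulation-based approximation of the same probability."""
    rng = random.Random(seed)
    wins = sum(
        rng.betavariate(alpha_b, beta_b) > rng.betavariate(alpha_a, beta_a)
        for _ in range(n_draws)
    )
    return wins / n_draws


if __name__ == "__main__":
    # Posteriors after 3/10 successes on arm A and 6/10 on arm B, Beta(1, 1) priors.
    exact = prob_b_beats_a(4, 8, 7, 5)
    approx = prob_b_beats_a_mc(4, 8, 7, 5)
    print(f"exact: {exact:.4f}, simulated: {approx:.4f}")
```

The simulated value carries Monte Carlo error that shrinks only at rate 1/sqrt(n_draws), which is the speed versus accuracy trade-off the abstract refers to.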
-
Confidence intervals for adaptive trial designs II: Case study and practical guidance
Authors:
David S. Robertson,
Thomas Burnett,
Babak Choodari-Oskooei,
Munya Dimairo,
Michael Grayling,
Philip Pallmann,
Thomas Jaki
Abstract:
In adaptive clinical trials, the conventional confidence interval (CI) for a treatment effect is prone to undesirable properties such as undercoverage and potential inconsistency with the final hypothesis testing decision. Accordingly, as is stated in recent regulatory guidance on adaptive designs, there is the need for caution in the interpretation of CIs constructed during and after an adaptive clinical trial. However, it may be unclear which of the available CIs in the literature are preferable. This paper is the second in a two-part series that explores CIs for adaptive trials. Part I provided a methodological review of approaches to construct CIs for adaptive designs. In this paper (part II), we present an extended case study based around a two-stage group sequential trial, including a comprehensive simulation study of the proposed CIs for this setting. This facilitates an expanded description of considerations around what makes for an effective CI procedure following an adaptive trial. We show that the CIs can have notably different properties. Finally, we propose a set of guidelines for researchers around the choice of CIs and the reporting of CIs following an adaptive design.
Submitted 13 November, 2024;
originally announced November 2024.
-
Confidence intervals for adaptive trial designs I: A methodological review
Authors:
David S. Robertson,
Thomas Burnett,
Babak Choodari-Oskooei,
Munya Dimairo,
Michael Grayling,
Philip Pallmann,
Thomas Jaki
Abstract:
Regulatory guidance notes the need for caution in the interpretation of confidence intervals (CIs) constructed during and after an adaptive clinical trial. Conventional CIs of the treatment effects are prone to undercoverage (as well as other undesirable properties) in many adaptive designs, because they do not take into account the potential and realised trial adaptations. This paper is the first in a two-part series that explores CIs for adaptive trials. It provides a comprehensive review of the methods to construct CIs for adaptive designs, while the second paper illustrates how to implement these in practice and proposes a set of guidelines for trial statisticians. We describe several classes of techniques for constructing CIs for adaptive clinical trials, before providing a systematic literature review of available methods, classified by the type of adaptive design. As part of this, we assess, through a proposed traffic light system, which of several desirable features of CIs (such as achieving nominal coverage and consistency with the hypothesis test decision) each of these methods holds.
Submitted 13 November, 2024;
originally announced November 2024.
-
The use of restricted mean survival time to estimate treatment effect under model misspecification, a simulation study
Authors:
Emily Alger,
David S. Robertson,
Abigail J. Burdon
Abstract:
The use of the non-parametric Restricted Mean Survival Time endpoint (RMST) has grown in popularity as trialists look to analyse time-to-event outcomes without the restrictions of the proportional hazards assumption. In this paper, we evaluate the power and type I error rate of the parametric and non-parametric RMST estimators when treatment effect is explained by multiple covariates, including an interaction term. Utilising the RMST estimator in this way allows the combined treatment effect to be summarised as a one-dimensional estimator, which is evaluated using a one-sided hypothesis Z-test. The estimators are either fully specified or misspecified, both in terms of unaccounted covariates or misspecified knot points (where trials exhibit crossing survival curves). A placebo-controlled trial of Gamma interferon is used as a motivating example to simulate associated survival times. When correctly specified, the parametric RMST estimator has the greatest power, regardless of the time of analysis. The misspecified RMST estimator generally performs similarly when covariates mirror those of the fitted case study dataset. However, as the magnitude of the unaccounted covariate increases, the associated power of the estimator decreases. In all cases, the non-parametric RMST estimator has the lowest power, and power remains very reliant on the time of analysis (with a later analysis time correlated with greater power).
Submitted 3 November, 2023;
originally announced November 2023.
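As background for the non-parametric estimator discussed in this abstract, the RMST up to a truncation time tau is the area under the survival curve on [0, tau]. The sketch below computes it from a Kaplan-Meier step function in plain Python. It is a minimal illustration, not the paper's simulation code, and the function name `km_rmst` is hypothetical.

```python
def km_rmst(times, events, tau):
    """Non-parametric RMST: area under the Kaplan-Meier curve on [0, tau].

    times  -- observed follow-up times (event or censoring)
    events -- 1 if the event was observed at that time, 0 if censored
    tau    -- truncation time for the restricted mean
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0      # current Kaplan-Meier survival estimate
    area = 0.0      # accumulated area under the step function
    last_t = 0.0    # time of the most recent drop in the curve
    i = 0
    while i < len(data) and data[i][0] <= tau:
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Add the rectangle up to t, then apply the Kaplan-Meier drop.
            area += surv * (t - last_t)
            surv *= 1.0 - deaths / n_at_risk
            last_t = t
        n_at_risk -= leaving
    # Final rectangle from the last drop to tau.
    area += surv * (tau - last_t)
    return area


if __name__ == "__main__":
    # Second participant is censored at time 2.
    print(km_rmst([1, 2, 3], [1, 0, 1], tau=3))
```

With no censoring the estimate equals the sample mean of min(T, tau), which gives a quick sanity check on the implementation.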
-
Sensitivity analysis for studies transporting prediction models
Authors:
Jon A. Steingrimsson,
Sarah E. Robertson,
Issa J. Dahabreh
Abstract:
We consider the estimation of measures of model performance in a target population when covariate and outcome data are available on a sample from some source population and covariate data, but not outcome data, are available on a simple random sample from the target population. When outcome data are not available from the target population, identification of measures of model performance is possible under an untestable assumption that the outcome and population (source or target population) are independent conditional on covariates. In practice, this assumption is uncertain and, in some cases, controversial. Therefore, sensitivity analysis may be useful for examining the impact of assumption violations on inferences about model performance. Here, we propose an exponential tilt sensitivity analysis model and develop statistical methods to determine how sensitive measures of model performance are to violations of the assumption of conditional independence between outcome and population. We provide identification results and estimators for the risk in the target population, examine the large-sample properties of the estimators, and apply the estimators to data on individuals with stable ischemic heart disease.
Submitted 13 June, 2023;
originally announced June 2023.
-
Generalizability analyses with a partially nested trial design: the Necrotizing Enterocolitis Surgery Trial
Authors:
Sarah E. Robertson,
Matthew A. Rysavy,
Martin L. Blakely,
Jon A. Steingrimsson,
Issa J. Dahabreh
Abstract:
We discuss generalizability analyses under a partially nested trial design, where part of the trial is nested within a cohort of trial-eligible individuals, while the rest of the trial is not nested. This design arises, for example, when only some centers participating in a trial are able to collect data on non-randomized individuals, or when data on non-randomized individuals cannot be collected for the full duration of the trial. Our work is motivated by the Necrotizing Enterocolitis Surgery Trial (NEST) that compared initial laparotomy versus peritoneal drain for infants with necrotizing enterocolitis or spontaneous intestinal perforation. During the first phase of the study, data were collected from randomized individuals as well as consenting non-randomized individuals; during the second phase of the study, however, data were only collected from randomized individuals, resulting in a partially nested trial design. We propose methods for generalizability analyses with partially nested trial designs. We describe identification conditions and propose estimators for causal estimands in the target population of all trial-eligible individuals, both randomized and non-randomized, in the part of the data where the trial is nested, while using trial information spanning both parts. We evaluate the estimators in a simulation study.
Submitted 1 June, 2023;
originally announced June 2023.
-
Point estimation for adaptive trial designs II: practical considerations and guidance
Authors:
David S. Robertson,
Babak Choodari-Oskooei,
Munya Dimairo,
Laura Flight,
Philip Pallmann,
Thomas Jaki
Abstract:
In adaptive clinical trials, the conventional end-of-trial point estimate of a treatment effect is prone to bias, that is, a systematic tendency to deviate from its true value. As stated in recent FDA guidance on adaptive designs, it is desirable to report estimates of treatment effects that reduce or remove this bias. However, it may be unclear which of the available estimators are preferable, and their use remains rare in practice. This paper is the second in a two-part series that studies the issue of bias in point estimation for adaptive trials. Part I provided a methodological review of approaches to remove or reduce the potential bias in point estimation for adaptive designs. In part II, we discuss how bias can affect standard estimators and assess the negative impact this can have. We review current practice for reporting point estimates and illustrate the computation of different estimators using a real adaptive trial example (including code), which we use as a basis for a simulation study. We show that while on average the values of these estimators can be similar, for a particular trial realisation they can give noticeably different values for the estimated treatment effect. Finally, we propose guidelines for researchers around the choice of estimators and the reporting of estimates following an adaptive design. The issue of bias should be considered throughout the whole lifecycle of an adaptive design, with the estimation strategy pre-specified in the statistical analysis plan. When available, unbiased or bias-reduced estimates are to be preferred.
Submitted 28 November, 2022;
originally announced November 2022.
-
Generalizing and transporting inferences about the effects of treatment assignment subject to non-adherence
Authors:
Issa J. Dahabreh,
Sarah E. Robertson,
Miguel A. Hernán
Abstract:
We discuss the identifiability of causal estimands for generalizability and transportability analyses, both under perfect and imperfect adherence to treatment assignment. We consider a setting where the trial data contain information on baseline covariates, assignment at baseline, intervention at baseline (point treatment), and outcomes; and where the data from non-randomized individuals only contain information on baseline covariates. In this setting, we review identification results under perfect adherence and study two examples in which non-adherence severely limits the ability to transport inferences about the effects of treatment assignment to the target population. In the first example, trial participation has a direct effect on treatment receipt and, through treatment receipt, on the outcome (a "trial engagement effect" via adherence). In the second example, participation in the trial has unmeasured common causes with treatment receipt. In both examples, the effect of assignment on the outcome in the target population is not identifiable. In the first example, however, the effect of joint interventions to scale-up trial activities that affect adherence and assign treatment is identifiable. We conclude that generalizability and transportability analyses should consider trial engagement effects via adherence and selection for participation on the basis of unmeasured factors that influence adherence.
Submitted 9 November, 2022;
originally announced November 2022.
-
Online multiple hypothesis testing
Authors:
David S. Robertson,
James M. S. Wason,
Aaditya Ramdas
Abstract:
Modern data analysis frequently involves large-scale hypothesis testing, which naturally gives rise to the problem of maintaining control of a suitable type I error rate, such as the false discovery rate (FDR). In many biomedical and technological applications, an additional complexity is that hypotheses are tested in an online manner, one-by-one over time. However, traditional procedures that control the FDR, such as the Benjamini-Hochberg procedure, assume that all p-values are available to be tested at a single time point. To address these challenges, a new field of methodology has developed over the past 15 years showing how to control error rates for online multiple hypothesis testing. In this framework, hypotheses arrive in a stream, and at each time point the analyst decides whether to reject the current hypothesis based both on the evidence against it, and on the previous rejection decisions. In this paper, we present a comprehensive exposition of the literature on online error rate control, with a review of key theory as well as a focus on applied examples. We also provide simulation results comparing different online testing algorithms and an up-to-date overview of the many methodological extensions that have been proposed.
Submitted 24 July, 2023; v1 submitted 24 August, 2022;
originally announced August 2022.
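The difference between offline and online testing described in this abstract can be illustrated with two simple procedures. The sketch below is an assumption-laden illustration, not one of the algorithms compared in the paper: it implements the Benjamini-Hochberg procedure, which needs all p-values at once, and online alpha-spending (an online Bonferroni rule), which tests the i-th hypothesis at level alpha * gamma_i for a fixed sequence gamma_i summing to one. The choice gamma_i = 6 / (pi^2 * i^2) and the function names are hypothetical.

```python
import math


def benjamini_hochberg(p_values, alpha=0.05):
    """Offline FDR control: requires every p-value before any decision."""
    m = len(p_values)
    order = sorted(range(m), key=lambda j: p_values[j])
    # Largest rank k (1-based) with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, j in enumerate(order, start=1):
        if p_values[j] <= rank * alpha / m:
            k_max = rank
    rejected = [False] * m
    for j in order[:k_max]:
        rejected[j] = True
    return rejected


def online_alpha_spending(p_values, alpha=0.05):
    """Online FWER control: each decision uses only the current p-value
    and its position in the stream."""
    decisions = []
    for i, p in enumerate(p_values, start=1):
        # gamma_i = 6 / (pi^2 * i^2) sums to 1 over i = 1, 2, ...
        level = alpha * 6.0 / (math.pi ** 2 * i ** 2)
        decisions.append(p <= level)
    return decisions


if __name__ == "__main__":
    stream = [0.01, 0.005, 0.5]
    print(online_alpha_spending(stream))
    print(benjamini_hochberg(stream))
```

Alpha-spending never revisits a decision and ignores past rejections, so it is conservative; the online FDR algorithms reviewed in the paper instead adapt later test levels to earlier rejection decisions.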
-
Global sensitivity analysis for studies extending inferences from a randomized trial to a target population
Authors:
Issa J. Dahabreh,
James M. Robins,
Sebastien J-P. A. Haneuse,
Sarah E. Robertson,
Jon A. Steingrimsson,
Miguel A. Hernán
Abstract:
When individuals participating in a randomized trial differ with respect to the distribution of effect modifiers compared with the target population where the trial results will be used, treatment effect estimates from the trial may not directly apply to the target population. Methods for extending -- generalizing or transporting -- causal inferences from the trial to the target population rely on conditional exchangeability assumptions between randomized and non-randomized individuals. The validity of these assumptions is often uncertain or controversial and investigators need to examine how violation of the assumptions would impact study conclusions. We describe methods for global sensitivity analysis that directly parameterize violations of the assumptions in terms of potential (counterfactual) outcome distributions. Our approach does not require detailed knowledge about the distribution of specific unmeasured effect modifiers or their relationship with the observed variables. We illustrate the methods using data from a trial nested within a cohort of trial-eligible individuals to compare coronary artery surgery plus medical therapy versus medical therapy alone for stable ischemic heart disease.
Submitted 20 July, 2022;
originally announced July 2022.
-
Some performance considerations when using multi-armed bandit algorithms in the presence of missing data
Authors:
Xijin Chen,
Kim May Lee,
Sofia S. Villar,
David S. Robertson
Abstract:
When comparing the performance of multi-armed bandit algorithms, the potential impact of missing data is often overlooked. In practice, it also affects their implementation where the simplest approach to overcome this is to continue to sample according to the original bandit algorithm, ignoring missing outcomes. We investigate the impact on performance of this approach to deal with missing data for several bandit algorithms through an extensive simulation study assuming the rewards are missing at random. We focus on two-armed bandit algorithms with binary outcomes in the context of patient allocation for clinical trials with relatively small sample sizes. However, our results apply to other applications of bandit algorithms where missing data is expected to occur. We assess the resulting operating characteristics, including the expected reward. Different probabilities of missingness in both arms are considered. The key finding of our work is that when using the simplest strategy of ignoring missing data, the impact on the expected performance of multi-armed bandit strategies varies according to the way these strategies balance the exploration-exploitation trade-off. Algorithms that are geared towards exploration continue to assign samples to the arm with more missing responses (which, being perceived as the arm with less observed information, is deemed more appealing by the algorithm than it would otherwise be). In contrast, algorithms that are geared towards exploitation would rapidly assign a high value to samples from the arms with a current high mean irrespective of the level of observations per arm. Furthermore, for algorithms focusing more on exploration, we illustrate that the problem of missing responses can be alleviated using a simple mean imputation approach.
Submitted 7 July, 2022; v1 submitted 8 May, 2022;
originally announced May 2022.
-
Cluster randomized trials designed to support generalizable inferences
Authors:
Sarah E. Robertson,
Jon A. Steingrimsson,
Issa J. Dahabreh
Abstract:
Background: When planning a cluster randomized trial, evaluators often have access to an enumerated cohort representing the target population of clusters. Practicalities of conducting the trial, such as the need to oversample clusters with certain characteristics to improve trial economy or to support inference about subgroups of clusters, may preclude simple random sampling from the cohort into the trial, and thus interfere with the goal of producing generalizable inferences about the target population.
Methods: We describe a nested trial design where the randomized clusters are embedded within a cohort of trial-eligible clusters from the target population and where clusters are selected for inclusion in the trial with known sampling probabilities that may depend on cluster characteristics (e.g., allowing clusters to be chosen to facilitate trial conduct or to examine hypotheses related to their characteristics). We develop and evaluate methods for analyzing data from this design to generalize causal inferences to the target population underlying the cohort.
Results: We present identification and estimation results for the expectation of the average potential outcome and for the average treatment effect, in the entire target population of clusters and in its non-randomized subset. In simulation studies we show that all the estimators have low bias but markedly different precision.
Conclusions: Cluster randomized trials where clusters are selected for inclusion with known sampling probabilities that depend on cluster characteristics, combined with efficient estimation methods, can precisely quantify treatment effects in the target population, while addressing objectives of trial conduct that require oversampling clusters on the basis of their characteristics.
Submitted 17 September, 2024; v1 submitted 6 April, 2022;
originally announced April 2022.
-
Extending inferences from a cluster randomized trial to a target population
Authors:
Issa J. Dahabreh,
Sarah E. Robertson,
Jon A. Steingrimsson,
Stefan Gravenstein,
Nina Joyce
Abstract:
We describe methods that extend (generalize or transport) causal inferences from cluster randomized trials to a target population of clusters, under a general nonparametric model that allows for arbitrary within-cluster dependence. We propose doubly robust estimators of potential outcome means in the target population that exploit individual-level data on covariates and outcomes to improve efficiency and are appropriate for use with machine learning methods. We illustrate the methods using a cluster randomized trial of influenza vaccination strategies conducted in 818 nursing homes nested in a cohort of 4,475 trial-eligible Medicare-certified nursing homes.
Submitted 28 March, 2022;
originally announced March 2022.
-
Learning about treatment effects in a new target population under transportability assumptions for relative effect measures
Authors:
Issa J. Dahabreh,
Sarah E. Robertson,
Jon A. Steingrimsson
Abstract:
Epidemiologists and applied statisticians often believe that relative effect measures conditional on covariates, such as risk ratios and mean ratios, are "transportable" across populations. Here, we examine the identification of causal effects in a target population using an assumption that conditional relative effect measures (e.g., conditional risk ratios or mean ratios) are transportable from a trial to the target population. We show that transportability for relative effect measures is largely incompatible with transportability for difference effect measures, unless the treatment has no effect on average or one is willing to make even stronger transportability assumptions, which imply the transportability of both relative and difference effect measures. We then describe how marginal causal estimands in a target population can be identified under the assumption of transportability of relative effect measures, when we are interested in the effectiveness of a new experimental treatment in a target population where the only treatment in use is the control treatment evaluated in the trial. We extend these results to consider cases where the control treatment evaluated in the trial is only one of the treatments in use in the target population, under an additional partial exchangeability assumption in the target population (i.e., a partial assumption of no unmeasured confounding in the target population). We also develop identification results that allow for the covariates needed for transportability of relative effect measures to be only a small subset of the covariates needed to control confounding in the target population. Last, we propose estimators that can be easily implemented in standard statistical software.
Submitted 23 February, 2022;
originally announced February 2022.
-
Online error control for platform trials
Authors:
David S. Robertson,
James M. S. Wason,
Franz König,
Martin Posch,
Thomas Jaki
Abstract:
Platform trials evaluate multiple experimental treatments under a single master protocol, where new treatment arms are added to the trial over time. Given the multiple treatment comparisons, there is the potential for inflation of the overall type I error rate, which is complicated by the fact that the hypotheses are tested at different times and are not all necessarily pre-specified. Online error control methodology provides a possible solution to the problem of multiplicity for platform trials where a relatively large number of hypotheses are expected to be tested over time. In the online testing framework, hypotheses are tested in a sequential manner, where at each time-step an analyst decides whether to reject the current null hypothesis without knowledge of future tests but based solely on past decisions. Methodology has recently been developed for online control of the false discovery rate as well as the familywise error rate (FWER). In this paper, we describe how to apply online error control to the platform trial setting, present extensive simulation results, and give some recommendations for the use of this new methodology in practice. We show that the algorithms for online error rate control can have a substantially lower FWER than uncorrected testing, while still achieving noticeable gains in power when compared with the use of a Bonferroni procedure. We also illustrate how online error control would have impacted a currently ongoing platform trial.
Submitted 8 February, 2022;
originally announced February 2022.
-
Regression-based estimation of heterogeneous treatment effects when extending inferences from a randomized trial to a target population
Authors:
Sarah E Robertson,
Jon A Steingrimsson,
Issa J Dahabreh
Abstract:
Methods for extending -- generalizing or transporting -- inferences from a randomized trial to a target population involve conditioning on a large set of covariates that is sufficient for rendering the randomized and non-randomized groups exchangeable. Yet, decision-makers are often interested in examining treatment effects in subgroups of the target population defined in terms of only a few discrete covariates. Here, we propose methods for estimating subgroup-specific potential outcome means and average treatment effects in generalizability and transportability analyses, using outcome model-based (g-formula), weighting, and augmented weighting estimators. We consider estimating subgroup-specific average treatment effects in the target population and its non-randomized subset, and provide methods that are appropriate both for nested and non-nested trial designs. As an illustration, we apply the methods to data from the Coronary Artery Surgery Study to compare the effect of surgery plus medical therapy versus medical therapy alone for chronic coronary artery disease in subgroups defined by history of myocardial infarction.
Submitted 30 September, 2021;
originally announced October 2021.
-
Estimating subgroup effects in generalizability and transportability analyses
Authors:
Sarah E. Robertson,
Jon A. Steingrimsson,
Nina R. Joyce,
Elizabeth A. Stuart,
Issa J. Dahabreh
Abstract:
Methods for extending -- generalizing or transporting -- inferences from a randomized trial to a target population involve conditioning on a large set of covariates that is sufficient for rendering the randomized and non-randomized groups exchangeable. Yet, decision-makers are often interested in examining treatment effects in subgroups of the target population defined in terms of only a few discrete covariates. Here, we propose methods for estimating subgroup-specific potential outcome means and average treatment effects in generalizability and transportability analyses, using outcome model-based (g-formula), weighting, and augmented weighting estimators. We consider estimating subgroup-specific average treatment effects in the target population and its non-randomized subset, and provide methods that are appropriate both for nested and non-nested trial designs. As an illustration, we apply the methods to data from the Coronary Artery Surgery Study to compare the effect of surgery plus medical therapy versus medical therapy alone for chronic coronary artery disease in subgroups defined by history of myocardial infarction.
Submitted 28 September, 2021;
originally announced September 2021.
-
Point estimation for adaptive trial designs I: a methodological review
Authors:
David S. Robertson,
Babak Choodari-Oskooei,
Munya Dimairo,
Laura Flight,
Philip Pallmann,
Thomas Jaki
Abstract:
Recent FDA guidance on adaptive clinical trial designs defines bias as "a systematic tendency for the estimate of treatment effect to deviate from its true value", and states that it is desirable to obtain and report estimates of treatment effects that reduce or remove this bias. The conventional end-of-trial point estimates of the treatment effects are prone to bias in many adaptive designs, because they do not take into account the potential and realised trial adaptations. While much of the methodological development on adaptive designs has tended to focus on control of type I error rates and power considerations, the question of biased estimation has received relatively less attention. This paper is the first in a two-part series that studies the issue of potential bias in point estimation for adaptive trials. Part I provides a comprehensive review of the methods to remove or reduce the potential bias in point estimation of treatment effects for adaptive designs, while part II illustrates how to implement these in practice and proposes a set of guidelines for trial statisticians. The methods reviewed in this paper can be broadly classified into unbiased and bias-reduced estimation, and we also provide a classification of estimators by the type of adaptive design. We compare the proposed methods, highlight available software and code, and discuss potential methodological gaps in the literature.
Submitted 28 November, 2022; v1 submitted 18 May, 2021;
originally announced May 2021.
-
Center-specific causal inference with multicenter trials: reinterpreting trial evidence in the context of each participating center
Authors:
Sarah E. Robertson,
Jon A. Steingrimsson,
Nina R. Joyce,
Elizabeth A. Stuart,
Issa J. Dahabreh
Abstract:
In multicenter randomized trials, when effect modifiers have a different distribution across centers, comparisons between treatment groups that average over centers may not apply to any of the populations underlying the individual centers. Here, we describe methods for reinterpreting the evidence produced by a multicenter trial in the context of the population underlying each center. We describe how to identify center-specific effects under identifiability conditions that are largely supported by the study design and when associations between center membership and the outcome may be present, given baseline covariates and treatment ("center-outcome associations"). We then consider an additional condition of no center-outcome associations given baseline covariates and treatment. We show that this condition can be assessed using the trial data; when it holds, center-specific treatment effects can be estimated using analyses that completely pool information across centers. We propose methods for estimating center-specific average treatment effects, when center-outcome associations may be present and when they are absent, and describe approaches for assessing whether center-specific treatment effects are homogeneous. We evaluate the performance of the methods in a simulation study and illustrate their implementation using data from the Hepatitis C Antiviral Long-Term Treatment Against Cirrhosis trial.
Submitted 25 April, 2021; v1 submitted 12 April, 2021;
originally announced April 2021.
-
Conditional Power and Friends: The Why and How of (Un)planned, Unblinded Sample Size Recalculations in Confirmatory Trials
Authors:
Kevin Kunzmann,
Michael J. Grayling,
Kim M. Lee,
David S. Robertson,
Kaspar Rufibach,
James M. S. Wason
Abstract:
Adapting the final sample size of a trial to the evidence accruing during the trial is a natural way to address planning uncertainty. Designs with adaptive sample size need to account for their optional stopping to guarantee strict type-I error-rate control. A variety of different methods to maintain type-I error-rate control after unplanned changes of the initial sample size have been proposed in the literature. This makes interim analyses for the purpose of sample size recalculation feasible in a regulatory context. Since the sample size is usually determined via an argument based on the power of the trial, an interim analysis raises the question of how the final sample size should be determined conditional on the accrued information. Conditional power is a concept often put forward in this context. Since it depends on the unknown effect size, we take a strict estimation perspective and compare assumed conditional power, observed conditional power, and predictive power with respect to their properties as estimators of the unknown conditional power. We then demonstrate that pre-planning an interim analysis using methodology for unplanned interim analyses is ineffective and naturally leads to the concept of optimal two-stage designs. We conclude that unplanned design adaptations should only be conducted as reaction to trial-external new evidence, operational needs to violate the originally chosen design, or post hoc changes in the objective criterion. Finally, we show that commonly discussed sample size recalculation rules can lead to paradoxical outcomes and propose two alternative ways of reacting to newly emerging trial-external evidence.
Submitted 13 October, 2020;
originally announced October 2020.
-
A review of Bayesian perspectives on sample size derivation for confirmatory trials
Authors:
Kevin Kunzmann,
Michael J. Grayling,
Kim May Lee,
David S. Robertson,
Kaspar Rufibach,
James M. S. Wason
Abstract:
Sample size derivation is a crucial element of the planning phase of any confirmatory trial. A sample size is typically derived based on constraints on the maximal acceptable type I error rate and a minimal desired power. Here, power depends on the unknown true effect size. In practice, power is typically calculated either for the smallest relevant effect size or a likely point alternative. The former might be problematic if the minimal relevant effect is close to the null, thus requiring an excessively large sample size. The latter is dubious since it does not account for the a priori uncertainty about the likely alternative effect size. A Bayesian perspective on the sample size derivation for a frequentist trial naturally emerges as a way of reconciling arguments about the relative a priori plausibility of alternative effect sizes with ideas based on the relevance of effect sizes. Many suggestions as to how such 'hybrid' approaches could be implemented in practice have been put forward in the literature. However, key quantities such as assurance, probability of success, or expected power are often defined in subtly different ways in the literature. Starting from the traditional and entirely frequentist approach to sample size derivation, we derive consistent definitions for the most commonly used 'hybrid' quantities and highlight connections, before discussing and demonstrating their use in the context of sample size derivation for clinical trials.
Submitted 28 June, 2020;
originally announced June 2020.
-
Response-adaptive randomization in clinical trials: from myths to practical considerations
Authors:
David S. Robertson,
Kim May Lee,
Boryana C. Lopez-Kolkovska,
Sofia S. Villar
Abstract:
Response-Adaptive Randomization (RAR) is part of a wider class of data-dependent sampling algorithms, for which clinical trials are typically used as a motivating application. In that context, patient allocation to treatments is determined by randomization probabilities that change based on the accrued response data in order to achieve experimental goals. RAR has received abundant theoretical attention from the biostatistical literature since the 1930s and has been the subject of numerous debates. In the last decade, it has received renewed consideration from the applied and methodological communities, driven by well-known practical examples and its widespread use in machine learning. Papers on the subject present different views on its usefulness, and these are not easy to reconcile. This work aims to address this gap by providing a unified, broad and fresh review of methodological and practical issues to consider when debating the use of RAR in clinical trials.
Submitted 7 June, 2022; v1 submitted 1 May, 2020;
originally announced May 2020.
-
Graphical approaches for the control of generalised error rates
Authors:
David S. Robertson,
James M. S. Wason,
Frank Bretz
Abstract:
When simultaneously testing multiple hypotheses, the usual approach in the context of confirmatory clinical trials is to control the familywise error rate (FWER), which bounds the probability of making at least one false rejection. In many trial settings, these hypotheses will additionally have a hierarchical structure that reflects the relative importance and links between different clinical objectives. The graphical approach of Bretz et al. (2009) is a flexible and easily communicable way of controlling the FWER while respecting complex trial objectives and multiple structured hypotheses. However, the FWER can be a very stringent criterion that leads to procedures with low power, and may not be appropriate in exploratory trial settings. This motivates controlling generalised error rates, particularly when the number of hypotheses tested is no longer small. We consider the generalised familywise error rate (k-FWER), which is the probability of making k or more false rejections, as well as the tail probability of the false discovery proportion (FDP), which is the probability that the proportion of false rejections is greater than some threshold. We also consider asymptotic control of the false discovery rate (FDR), which is the expectation of the FDP. In this paper, we show how to control these generalised error rates when using the graphical approach and its extensions. We demonstrate the utility of the resulting graphical procedures on three clinical trial case studies.
Submitted 2 June, 2020; v1 submitted 3 April, 2020;
originally announced April 2020.
-
The use of Convolutional Neural Networks for signal-background classification in Particle Physics experiments
Authors:
Venkitesh Ayyar,
Wahid Bhimji,
Lisa Gerhardt,
Sally Robertson,
Zahra Ronaghi
Abstract:
The success of Convolutional Neural Networks (CNNs) in image classification has prompted efforts to study their use for classifying image data obtained in Particle Physics experiments. Here, we discuss our efforts to apply CNNs to 2D and 3D image data from particle physics experiments to classify signal from background.
In this work we present an extensive convolutional neural architecture search, achieving high accuracy for signal/background discrimination for a HEP classification use-case based on simulated data from the IceCube neutrino observatory and an ATLAS-like detector. We demonstrate, among other things, that we can achieve the same accuracy as complex ResNet architectures using CNNs with fewer parameters, and present comparisons of computational requirements, training and inference times.
Submitted 13 February, 2020;
originally announced February 2020.
-
Efficient and robust methods for causally interpretable meta-analysis: transporting inferences from multiple randomized trials to a target population
Authors:
Issa J. Dahabreh,
Sarah E. Robertson,
Lucia C. Petito,
Miguel A. Hernán,
Jon A. Steingrimsson
Abstract:
We present methods for causally interpretable meta-analyses that combine information from multiple randomized trials to estimate potential (counterfactual) outcome means and average treatment effects in a target population. We consider identifiability conditions, derive implications of the conditions for the law of the observed data, and obtain identification results for transporting causal inferences from a collection of independent randomized trials to a new target population in which experimental data may not be available. We propose an estimator for the potential (counterfactual) outcome mean in the target population under each treatment studied in the trials. The estimator uses covariate, treatment, and outcome data from the collection of trials, but only covariate data from the target population sample. We show that it is doubly robust, in the sense that it is consistent and asymptotically normal when at least one of the models it relies on is correctly specified. We study the finite sample properties of the estimator in simulation studies and demonstrate its implementation using data from a multi-center randomized trial.
Submitted 4 February, 2022; v1 submitted 24 August, 2019;
originally announced August 2019.
-
Fitting motion models to contextual player behavior
Authors:
Bartholomew Spencer,
Karl Jackson,
Sam Robertson
Abstract:
The objective of this study was to incorporate contextual information into the modelling of player movements. This was achieved by combining the distributions of forthcoming passing contests that players committed to and those they did not. The resultant array measures the probability a player would commit to forthcoming contests in their vicinity. Commitment-based motion models were fit on 46220 samples of player behavior in the Australian Football League. It was found that the shape of commitment-based models differed greatly from that of displacement-based models for Australian footballers. Player commitment arrays were used to measure the spatial occupancy and dominance of the attacking team. The spatial characteristics of pass receivers were extracted for 2934 passes. Positional trends in passing were identified. Furthermore, passes were clustered into three components using Gaussian mixture models. Passes in the AFL are most commonly to one-on-one contests or unmarked players. Furthermore, passes were rarely greater than 25 m.
Submitted 24 July, 2019;
originally announced July 2019.
-
Sensitivity analysis using bias functions for studies extending inferences from a randomized trial to a target population
Authors:
Issa J. Dahabreh,
James M. Robins,
Sebastien J-P. A. Haneuse,
Iman Saeed,
Sarah E. Robertson,
Elisabeth A. Stuart,
Miguel A. Hernán
Abstract:
Extending (generalizing or transporting) causal inferences from a randomized trial to a target population requires "generalizability" or "transportability" assumptions, which state that randomized and non-randomized individuals are exchangeable conditional on baseline covariates. These assumptions are made on the basis of background knowledge, which is often uncertain or controversial, and need to be subjected to sensitivity analysis. We present simple methods for sensitivity analyses that do not require detailed background knowledge about specific unknown or unmeasured determinants of the outcome or modifiers of the treatment effect. Instead, our methods directly parameterize violations of the assumptions using bias functions. We show how the methods can be applied to non-nested trial designs, where the trial data are combined with a separately obtained sample of non-randomized individuals, as well as to nested trial designs, where a clinical trial is embedded within a cohort sampled from the target population. We illustrate the methods using data from a clinical trial comparing treatments for chronic hepatitis C infection.
Submitted 25 May, 2019;
originally announced May 2019.
-
Study designs for extending causal inferences from a randomized trial to a target population
Authors:
Issa J. Dahabreh,
Sebastien J-P. A. Haneuse,
James M. Robins,
Sarah E. Robertson,
Ashley L. Buchanan,
Elisabeth A. Stuart,
Miguel A. Hernán
Abstract:
We examine study designs for extending (generalizing or transporting) causal inferences from a randomized trial to a target population. Specifically, we consider nested trial designs, where randomized individuals are nested within a sample from the target population, and non-nested trial designs, including composite dataset designs, where a randomized trial is combined with a separately obtained sample of non-randomized individuals from the target population. We show that the causal quantities that can be identified in each study design depend on what is known about the probability of sampling non-randomized individuals. For each study design, we examine identification of potential outcome means via the g-formula and inverse probability weighting. Last, we explore the implications of the sampling properties underlying the designs for the identification and estimation of the probability of trial participation.
Submitted 19 May, 2019;
originally announced May 2019.
-
Towards causally interpretable meta-analysis: transporting inferences from multiple studies to a target population
Authors:
Issa J. Dahabreh,
Lucia C. Petito,
Sarah E. Robertson,
Miguel A. Hernán,
Jon A. Steingrimsson
Abstract:
We take steps towards causally interpretable meta-analysis by describing methods for transporting causal inferences from a collection of randomized trials to a new target population, one-trial-at-a-time and pooling all trials. We discuss identifiability conditions for average treatment effects in the target population and provide identification results. We show that assuming inferences are transportable from all trials in the collection to the same target population has implications for the law underlying the observed data. We propose average treatment effect estimators that rely on different working models and provide code for their implementation in statistical software. We discuss how to use the data to examine whether transported inferences are homogeneous across the collection of trials, sketch approaches for sensitivity analysis to violations of the identifiability conditions, and describe extensions to address non-adherence in the trials. Last, we illustrate the proposed methods using data from the HALT-C multi-center trial.
Submitted 8 February, 2020; v1 submitted 27 March, 2019;
originally announced March 2019.
-
Generalizing trial findings using nested trial designs with sub-sampling of non-randomized individuals
Authors:
Issa J. Dahabreh,
Miguel A. Hernan,
Sarah E. Robertson,
Ashley Buchanan,
Jon A. Steingrimsson
Abstract:
To generalize inferences from a randomized trial to the target population of all trial-eligible individuals, investigators can use nested trial designs, where the randomized individuals are nested within a cohort of trial-eligible individuals, including those who are not offered or refuse randomization. In these designs, data on baseline covariates are collected from the entire cohort, and treatment and outcome data need only be collected from randomized individuals. In this paper, we describe nested trial designs that improve research economy by collecting additional baseline covariate data after sub-sampling non-randomized individuals (i.e., a two-stage design), using sampling probabilities that may depend on the initial set of baseline covariates available from all individuals in the cohort. We propose an estimator for the potential outcome mean in the target population of all trial-eligible individuals and show that our estimator is doubly robust, in the sense that it is consistent when either the model for the conditional outcome mean among randomized individuals or the model for the probability of trial participation is correctly specified. We assess the impact of sub-sampling on the asymptotic variance of our estimator and examine the estimator's finite-sample performance in a simulation study. We illustrate the methods using data from the Coronary Artery Surgery Study (CASS).
Submitted 7 March, 2019; v1 submitted 16 February, 2019;
originally announced February 2019.
-
Exploring spectro-temporal features in end-to-end convolutional neural networks
Authors:
Sean Robertson,
Gerald Penn,
Yingxue Wang
Abstract:
Triangular, overlapping Mel-scaled filters ("f-banks") are the current standard input for acoustic models that exploit their input's time-frequency geometry, because they provide a psycho-acoustically motivated time-frequency geometry for a speech signal. F-bank coefficients are provably robust to small deformations in the scale. In this paper, we explore two ways in which filter banks can be adjusted for the purposes of speech recognition. First, triangular filters can be replaced with Gabor filters, a compactly supported filter that better localizes events in time, or Gammatone filters, a psychoacoustically-motivated filter. Second, by rearranging the order of operations in computing filter bank features, features can be integrated over smaller time scales while simultaneously providing better frequency resolution. We make all feature implementations available online through open-source repositories. Initial experimentation with a modern end-to-end CNN phone recognizer yielded no significant improvements to phone error rate due to either modification. The result, and its ramifications with respect to learned filter banks, is discussed.
Submitted 31 December, 2018;
originally announced January 2019.
-
Online control of the false discovery rate in biomedical research
Authors:
David S. Robertson,
James M. S. Wason
Abstract:
Modern biomedical research frequently involves testing multiple related hypotheses, while maintaining control over a suitable error rate. In many applications the false discovery rate (FDR), which is the expected proportion of false positives among the rejected hypotheses, has become the standard error criterion. Procedures that control the FDR, such as the well-known Benjamini-Hochberg procedure, assume that all p-values are available to be tested at a single time point. However, this ignores the sequential nature of many biomedical experiments, where a sequence of hypotheses is tested without having access to future p-values or even the number of hypotheses. Recently, the first procedures that control the FDR in this online manner have been proposed by Javanmard and Montanari (Ann. Stat. 2018), and built upon by Ramdas et al. (NIPS 2017, ICML 2018). In this paper, we compare and contrast these proposed procedures, with a particular focus on the setting where the p-values are dependent. We also propose a simple modification of the procedures for when there is an upper bound on the number of hypotheses to be tested. Using comprehensive simulation scenarios and case studies, we provide recommendations for which procedures to use in practice for online FDR control.
Submitted 26 September, 2018; v1 submitted 19 September, 2018;
originally announced September 2018.
-
Extending inferences from a randomized trial to a new target population
Authors:
Issa J. Dahabreh,
Sarah E. Robertson,
Jon A. Steingrimsson,
Elizabeth A. Stuart,
Miguel A. Hernan
Abstract:
When treatment effect modifiers influence the decision to participate in a randomized trial, the average treatment effect in the population represented by the randomized individuals will differ from the effect in other populations. In this tutorial, we consider methods for extending causal inferences about time-fixed treatments from a trial to a new target population of non-participants, using data from a completed randomized trial and baseline covariate data from a sample from the target population. We examine methods based on modeling the expectation of the outcome, the probability of participation, or both (doubly robust). We compare the methods in a simulation study and show how they can be implemented in software. We apply the methods to a randomized trial nested within a cohort of trial-eligible patients to compare coronary artery surgery plus medical therapy versus medical therapy alone for patients with chronic coronary artery disease. We conclude by discussing issues that arise when using the methods in applied analyses.
Submitted 28 October, 2019; v1 submitted 1 May, 2018;
originally announced May 2018.
-
Familywise error control in multi-armed response-adaptive trials
Authors:
David S. Robertson,
James M. S. Wason
Abstract:
Response-adaptive designs allow the randomization probabilities to change during the course of a trial based on cumulated response data, so that a greater proportion of patients can be allocated to the better performing treatments. A major concern over the use of response-adaptive designs in practice, particularly from a regulatory viewpoint, is controlling the type I error rate. In particular, we show that the naive z-test can have an inflated type I error rate even after applying a Bonferroni correction. Simulation studies have often been used to demonstrate error control, but do not provide a guarantee. In this paper, we present adaptive testing procedures for normally distributed outcomes that ensure strong familywise error control, by iteratively applying the conditional invariance principle. Our approach can be used for fully sequential and block randomized trials, and for a large class of adaptive randomization rules found in the literature. We show there is a high price to pay in terms of power to guarantee familywise error control for randomization schemes with extreme allocation probabilities. However, for proposed Bayesian adaptive randomization schemes in the literature, our adaptive tests maintain or increase the power of the trial compared to the z-test. We illustrate our method using a three-armed trial in primary hypercholesterolemia.
Submitted 14 March, 2018;
originally announced March 2018.
-
Generalizing causal inferences from individuals in randomized trials to all trial-eligible individuals
Authors:
Issa Dahabreh,
Sarah Robertson,
Eric Tchetgen Tchetgen,
Elizabeth Stuart,
Miguel Hernan
Abstract:
We consider methods for causal inference in randomized trials nested within cohorts of trial-eligible individuals, including those who are not randomized. We show how baseline covariate data from the entire cohort, and treatment and outcome data only from randomized individuals, can be used to identify potential (counterfactual) outcome means and average treatment effects in the target population of all eligible individuals. We review identifiability conditions, propose estimators, and assess the estimators' finite-sample performance in simulation studies. As an illustration, we apply the estimators in a trial nested within a cohort of trial-eligible individuals to compare coronary artery bypass grafting surgery plus medical therapy vs. medical therapy alone for chronic coronary artery disease.
Submitted 29 October, 2019; v1 submitted 13 September, 2017;
originally announced September 2017.