-
Voting Rights, Markov Chains, and Optimization by Short Bursts
Authors:
Sarah Cannon,
Ari Goldbloom-Helzner,
Varun Gupta,
JN Matthews,
Bhushan Suwal
Abstract:
Finding outlying elements in probability distributions can be a hard problem. Taking a real example from Voting Rights Act enforcement, we consider the problem of maximizing the number of simultaneous majority-minority districts in a political districting plan. An unbiased random walk on districting plans is unlikely to find plans that approach this maximum. A common search approach is to use a biased random walk: preferentially select districting plans with more majority-minority districts. Here, we present a third option, called short bursts, in which an unbiased random walk is performed for a small number of steps (called the burst length), then re-started from the most extreme plan that was encountered in the last burst. We give empirical evidence that short-burst runs outperform biased random walks for the problem of maximizing the number of majority-minority districts, and that there are many values of burst length for which we see this improvement. Abstracting from our use case, we also consider short bursts where the underlying state space is a line with various probability distributions, and then explore some features of more complicated state spaces and how these impact the effectiveness of short bursts.
Submitted 22 June, 2022; v1 submitted 23 October, 2020;
originally announced November 2020.
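The short-burst procedure described in this abstract can be sketched on the simplest state space it mentions, a line. This is a minimal illustration, not the authors' implementation: the function names (`short_bursts`, `line_step`), the choice to score a state by its position, the decision to let the burst's starting state count as a candidate, and the tie-breaking rule are all assumptions made for the example.

```python
import random


def short_bursts(score, step, start, burst_length, num_bursts, rng):
    """Run unbiased bursts, restarting each one from the best state of the previous burst."""
    best = start
    for _ in range(num_bursts):
        state = best
        burst_best = best
        for _ in range(burst_length):
            # Unbiased proposal: the score plays no role in choosing the next state.
            state = step(state, rng)
            # Assumption: ties go to the more recent state.
            if score(state) >= score(burst_best):
                burst_best = state
        # Restart the next burst from the most extreme state seen in this one.
        best = burst_best
    return best


def line_step(x, rng):
    """Simple unbiased walk on the integers: one step left or right with equal probability."""
    return x + rng.choice((-1, 1))


rng = random.Random(0)
result = short_bursts(
    score=lambda x: x,  # hypothetical objective: larger position is "more extreme"
    step=line_step,
    start=0,
    burst_length=10,
    num_bursts=100,
    rng=rng,
)
print("final state after 100 bursts of length 10:", result)
```

Because each burst starts from the best state found so far and that state is itself a candidate, the score of the restart point never decreases from one burst to the next. In a districting application, `step` would be replaced by a move on districting plans and `score` by the number of majority-minority districts.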
-
Inconsistent treatment estimates from mis-specified logistic regression analyses of randomized trials
Authors:
J. N. S. Matthews,
Nuri H. Badi
Abstract:
When the difference between treatments in a clinical trial is estimated by a difference in means, then it is well known that randomization ensures unbiassed estimation, even if no account is taken of important baseline covariates. However, when the treatment effect is assessed by other summaries, e.g. by an odds ratio if the outcome is binary, then bias can arise if some covariates are omitted, regardless of the use of randomization for treatment allocation or the size of the trial. We present accurate closed-form approximations for this asymptotic bias when important Normally distributed covariates are omitted from a logistic regression. We compare this approximation with ones in the literature and derive more convenient forms for some of these existing results. The expressions give insight into the form of the bias, which simulations show is usable for distributions other than the Normal. The key result applies even when there are additional binary covariates in the model.
Submitted 21 July, 2014;
originally announced July 2014.
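The effect described in this abstract can be illustrated numerically. The sketch below is not the authors' closed-form approximation; it assumes a hypothetical model logit P(Y=1 | T, Z) = alpha + beta*T + gamma*Z with a standard Normal covariate Z that is independent of the treatment indicator T (as randomization would give), and computes by numerical integration the log odds ratio that a model omitting Z would target. The parameter values are arbitrary.

```python
import math


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def marginal_prob(eta, gamma, n=4000, zmax=8.0):
    """E[expit(eta + gamma * Z)] for Z ~ N(0, 1), by the midpoint rule on [-zmax, zmax]."""
    h = 2.0 * zmax / n
    total = 0.0
    for i in range(n):
        z = -zmax + (i + 0.5) * h
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += expit(eta + gamma * z) * density * h
    return total


def marginal_log_odds_ratio(alpha, beta, gamma):
    """Treatment log odds ratio after averaging over the omitted covariate Z."""
    p_treated = marginal_prob(alpha + beta, gamma)
    p_control = marginal_prob(alpha, gamma)
    return logit(p_treated) - logit(p_control)


alpha, beta, gamma = 0.0, 1.0, 2.0  # hypothetical values, chosen only for illustration
marginal = marginal_log_odds_ratio(alpha, beta, gamma)
print("conditional log odds ratio:", beta)
print("log odds ratio with Z omitted:", marginal)
```

With gamma set to 0 the two quantities coincide; with a non-zero gamma the omitted-covariate value is pulled towards zero even though Z is independent of treatment, which is the kind of discrepancy the abstract refers to.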
-
Statistical Methods for Investigating the Cosmic Ray Energy Spectrum
Authors:
J. D. Hague,
B. R. Becker,
M. S. Gold,
J. A. J. Matthews,
J. Urbář
Abstract:
Two separate statistical tests are described and developed in order to test un-binned data sets for adherence to the power-law form. The first test employs the TP-statistic, a function defined to deviate from zero when the sample deviates from the power-law form, regardless of the value of the power index. The second test employs a likelihood ratio test to reject a power-law background in favor of a model signal distribution with a cut-off.
Submitted 18 October, 2007;
originally announced October 2007.
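The abstract does not define the TP-statistic. The sketch below assumes one form found in the power-law testing literature, TP = (mean of ln(x/u))^2 - (1/2) * (mean of ln^2(x/u)) for events above a threshold u; whether this matches the paper's definition exactly is an assumption. For a pure power-law tail with P(X > x) = (x/u)^(-a), the two terms have expectations 1/a^2 and (1/2)(2/a^2), so they cancel for any index a. The function name, the sample sizes, and the exponential comparison sample are all illustrative choices.

```python
import math
import random


def tp_statistic(sample, u):
    """Assumed TP form: (mean log)^2 - 0.5 * (mean squared log), using events with x >= u."""
    logs = [math.log(x / u) for x in sample if x >= u]
    n = len(logs)
    mean_log = sum(logs) / n
    mean_log_sq = sum(v * v for v in logs) / n
    return mean_log ** 2 - 0.5 * mean_log_sq


rng = random.Random(1)
u, index = 1.0, 2.0  # hypothetical threshold and power index

# Inverse-CDF sampling from a pure power-law tail: P(X > x) = (x / u) ** (-index).
pareto = [u * (1.0 - rng.random()) ** (-1.0 / index) for _ in range(20000)]

# A non-power-law comparison sample: exponential excess above the same threshold.
expo = [u + rng.expovariate(1.0) for _ in range(20000)]

tp_pareto = tp_statistic(pareto, u)
tp_expo = tp_statistic(expo, u)
print("TP, power-law sample:", tp_pareto)
print("TP, exponential-tail sample:", tp_expo)
```

The power-law sample gives a value near zero up to sampling noise, while the exponential-tail sample gives a clearly non-zero value. A usable test would also need the sampling distribution of TP under the null, which this sketch does not attempt.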
-
Power Laws and the Cosmic Ray Energy Spectrum
Authors:
J. D. Hague,
B. R. Becker,
M. S. Gold,
J. A. J. Matthews
Abstract:
Two separate statistical tests are applied to the AGASA and preliminary Auger Cosmic Ray Energy spectra in an attempt to find deviation from a pure power-law. The first test is constructed from the probability distribution for the maximum event of a sample drawn from a power-law. The second employs the TP-statistic, a function defined to deviate from zero when the sample deviates from the power-law form, regardless of the value of the power index. The AGASA data show no significant deviation from a power-law when subjected to both tests. Applying these tests to the Auger spectrum suggests deviation from a power-law. However, potentially large systematics on the relative energy scale prevent us from drawing definite conclusions at this time.
Submitted 30 October, 2006;
originally announced October 2006.
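The first test in this abstract rests on the distribution of the largest event in a power-law sample. As a minimal sketch, assume n independent events above a threshold u with P(X > x) = (x/u)^(-a); the probability that the largest of them is at most x is then (1 - (x/u)^(-a))^n. The function name and the numbers are hypothetical, and the paper's actual construction (for example, how the index and threshold are handled) may differ.

```python
def max_event_cdf(x, n, u, index):
    """P(max of n i.i.d. power-law events <= x), with P(X > x) = (x / u) ** (-index)."""
    if x < u:
        return 0.0
    return (1.0 - (x / u) ** (-index)) ** n


# Hypothetical numbers: 1000 events above threshold u = 1 with power index 2.
n, u, index = 1000, 1.0, 2.0
for x in (10.0, 30.0, 100.0):
    print("P(max <=", x, ") =", max_event_cdf(x, n, u, index))
```

An observed maximum that falls far into the lower tail of this distribution would suggest fewer high-energy events than a pure power law predicts.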