Exact and arbitrarily accurate non-parametric two-sample tests based on rank spacings
Authors:
Dan D. Erdmann-Pham,
Jonathan Terhorst,
Yun S. Song
Abstract:
A common method for deriving non-parametric tests is to reformulate a parametric test in terms of sample ranks. Despite being distribution free (even in finite samples), the resulting tests often display remarkable asymptotic power properties, typically matching the efficiency of their parametric counterparts. Empirically, these favorable power properties have been shown to persist in non-asymptotic regimes as well, prompting the need for finite-sample characterizations of the corresponding rank-based statistics. Here, we provide such a characterization for the family of weighted $p$-norms of rank spacings, which includes the classical tests of Mann-Whitney, Dixon, and various generalizations thereof. For $p=1$, we provide exact expressions for the involved distributions, while for $p>1$ we describe the associated moment sequences and derive an algorithm to recover the distributions of interest from these sequences in a fast and stable manner. We use this framework to develop a new family of non-parametric tests mirroring properties of generalized likelihood-ratios, prove new tail bounds for Dixon's and Greenwood's statistics, and prove a previously formulated conjecture regarding the global efficiency of rank-based tests against the $F$-test in the context of scale-families.
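As a rough illustration of the "weighted $p$-norm of rank spacings" idea, the sketch below counts how many points of one sample fall into each gap between the sorted points of the other, and then forms a weighted sum of $p$-th powers of those counts. The function names, the choice of which sample defines the gaps, and the weight vector are assumptions made for this example, not the paper's notation; ties are assumed absent. With $p=1$ and linearly decreasing weights, the sum equals the Mann-Whitney $U$ count, which the example checks by brute force.

```python
from bisect import bisect_left


def rank_spacings(x, y):
    """Count the y-values in each gap between consecutive sorted x-values.

    Returns len(x) + 1 counts: gap 0 lies below min(x), the last gap above max(x).
    Assumes no ties between x and y (hypothetical helper, not from the paper).
    """
    xs = sorted(x)
    counts = [0] * (len(xs) + 1)
    for v in y:
        counts[bisect_left(xs, v)] += 1
    return counts


def weighted_p_norm(spacings, weights, p):
    """Weighted sum of p-th powers of the spacings (the p-norm raised to the p)."""
    return sum(w * s ** p for w, s in zip(weights, spacings))


def mann_whitney_u(x, y):
    """Brute-force Mann-Whitney count: number of pairs with y_j < x_i."""
    return sum(1 for a in x for b in y if b < a)


x = [0.5, 2.0, 3.5]
y = [0.1, 1.0, 1.5, 4.0]
spacings = rank_spacings(x, y)                      # [1, 2, 0, 1]
k = len(x)
linear_weights = [k - j for j in range(k + 1)]      # [3, 2, 1, 0]
u_from_spacings = weighted_p_norm(spacings, linear_weights, 1)
sum_of_squares = weighted_p_norm(spacings, [1] * (k + 1), 2)
```

Here `u_from_spacings` agrees with `mann_whitney_u(x, y)`, and `sum_of_squares` is the unweighted $p=2$ case, which corresponds to Dixon-type statistics only up to normalization.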
Submitted 8 August, 2022; v1 submitted 15 August, 2020;
originally announced August 2020.
The key parameters that govern translation efficiency
Authors:
Dan D. Erdmann-Pham,
Khanh Dao Duc,
Yun S. Song
Abstract:
Translation of mRNA into protein is a fundamental yet complex biological process with multiple factors that can potentially affect its efficiency. Here, we study a stochastic model describing the traffic flow of ribosomes along the mRNA (namely, the inhomogeneous $\ell$-TASEP), and identify the key parameters that govern the overall rate of protein synthesis, sensitivity to initiation rate changes, and efficiency of ribosome usage. By analyzing a continuum limit of the model, we obtain closed-form expressions for stationary currents and ribosomal densities, which agree well with Monte Carlo simulations. Furthermore, we completely characterize the phase transitions in the system, and by applying our theoretical results, we formulate design principles that detail how to tune the key parameters we identified to optimize translation efficiency. Using ribosome profiling data from S. cerevisiae, we show that its translation system is generally consistent with these principles. Our theoretical results have implications for evolutionary biology, as well as synthetic biology.
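For readers unfamiliar with the model, the following is a minimal Monte Carlo sketch of an inhomogeneous $\ell$-TASEP with open boundaries. It is not the authors' simulation code. It assumes a random-sequential-update scheme as a stand-in for continuous-time dynamics, tracks each particle by its leftmost site, and lets a particle leave once that tracked site reaches the end of the lattice. The function name `simulate_ell_tasep` and all parameter names are hypothetical.

```python
import random


def simulate_ell_tasep(hop_rates, ell, alpha, beta, steps, seed=0):
    """Random-sequential-update sketch of an open-boundary ell-TASEP.

    hop_rates[i]: rate of hopping from site i to i + 1 (lattice has len + 1 sites).
    ell: number of sites covered by one particle (tracked by its leftmost site).
    alpha / beta: entry rate at site 0 / exit rate at the last site.
    Returns (estimated current, final occupancy of tracked sites).
    """
    rng = random.Random(seed)
    n = len(hop_rates) + 1
    occupied = [False] * n  # occupied[i]: some particle's tracked site is i
    r_max = max(max(hop_rates), alpha, beta)
    exits = 0
    for _ in range(steps):
        k = rng.randrange(n + 1)  # pick entry (k == n) or a lattice site
        if k == n:
            # Entry needs the first ell sites to be clear of tracked sites.
            if not any(occupied[:ell]) and rng.random() < alpha / r_max:
                occupied[0] = True
        elif occupied[k]:
            if k == n - 1:
                if rng.random() < beta / r_max:
                    occupied[k] = False
                    exits += 1
            else:
                # The hop is blocked if the next particle starts at k + ell.
                blocked = k + ell < n and occupied[k + ell]
                if not blocked and rng.random() < hop_rates[k] / r_max:
                    occupied[k] = False
                    occupied[k + 1] = True
    # Each update step advances time by 1 / ((n + 1) * r_max) on average.
    elapsed = steps / ((n + 1) * r_max)
    return exits / elapsed, occupied


current, final_state = simulate_ell_tasep(
    hop_rates=[1.0] * 29, ell=3, alpha=0.3, beta=1.0, steps=50_000, seed=1
)
```

The returned current includes the initial transient from an empty lattice, so a more careful estimate would discard a burn-in period before counting exits.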
Submitted 16 January, 2020; v1 submitted 15 March, 2018;
originally announced March 2018.