Showing 1–15 of 15 results for author: Katsevich, E

  1. arXiv:2501.03530  [pdf, other]

    stat.ME stat.AP

    The permuted score test for robust differential expression analysis

    Authors: Timothy Barry, Ziang Niu, Eugene Katsevich, Xihong Lin

    Abstract: Negative binomial (NB) regression is a popular method for identifying differentially expressed genes in genomics data, such as bulk and single-cell RNA sequencing data. However, NB regression makes stringent parametric and asymptotic assumptions, which can fail to hold in practice, leading to excess false positive and false negative results. We propose the permuted score test, a new strategy for r…

    Submitted 6 January, 2025; originally announced January 2025.

  2. arXiv:2409.09512  [pdf, other]

    stat.ME

    Doubly robust and computationally efficient high-dimensional variable selection

    Authors: Abhinav Chakraborty, Jeffrey Zhang, Eugene Katsevich

    Abstract: The variable selection problem is to discover which of a large set of predictors is associated with an outcome of interest, conditionally on the other predictors. This problem has been widely studied, but existing approaches lack either power against complex alternatives, robustness to model misspecification, computational efficiency, or quantification of evidence against individual hypotheses. We…

    Submitted 14 September, 2024; originally announced September 2024.

  3. arXiv:2407.08915  [pdf, other]

    math.ST math.PR

    The saddlepoint approximation for averages of conditionally independent random variables

    Authors: Ziang Niu, Jyotishka Ray Choudhury, Eugene Katsevich

    Abstract: Motivated by the application of saddlepoint approximations to resampling-based statistical tests, we prove that the Lugannani-Rice formula has vanishing relative error when applied to approximate conditional tail probabilities of averages of conditionally independent random variables. In a departure from existing work, this result is valid under only sub-exponential assumptions on the summands, an…

    Submitted 30 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

  4. arXiv:2407.08911  [pdf, other]

    stat.ME stat.AP

    Computationally efficient and statistically accurate conditional independence testing with spaCRT

    Authors: Ziang Niu, Jyotishka Ray Choudhury, Eugene Katsevich

    Abstract: We introduce the saddlepoint approximation-based conditional randomization test (spaCRT), a novel conditional independence test that effectively balances statistical accuracy and computational efficiency, inspired by applications to single-cell CRISPR screens. Resampling-based methods like the distilled conditional randomization test (dCRT) offer statistical precision but at a high computational c…

    Submitted 14 September, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

  5. arXiv:2211.14698  [pdf, other]

    stat.ME math.ST

    Reconciling model-X and doubly robust approaches to conditional independence testing

    Authors: Ziang Niu, Abhinav Chakraborty, Oliver Dukes, Eugene Katsevich

    Abstract: Model-X approaches to testing conditional independence between a predictor and an outcome variable given a vector of covariates usually assume exact knowledge of the conditional distribution of the predictor given the covariates. Nevertheless, model-X methodologies are often deployed with this conditional distribution learned in sample. We investigate the consequences of this choice through the le…

    Submitted 8 February, 2023; v1 submitted 26 November, 2022; originally announced November 2022.

  6. arXiv:2201.01879  [pdf, other]

    stat.ME stat.AP

    Exponential family measurement error models for single-cell CRISPR screens

    Authors: Timothy Barry, Kathryn Roeder, Eugene Katsevich

    Abstract: CRISPR genome engineering and single-cell RNA sequencing have accelerated biological discovery. Single-cell CRISPR screens unite these two technologies, linking genetic perturbations in individual cells to changes in gene expression and illuminating regulatory networks underlying diseases. Despite their promise, single-cell CRISPR screens present substantial statistical challenges. We demonstrate…

    Submitted 12 March, 2024; v1 submitted 5 January, 2022; originally announced January 2022.

  7. arXiv:2102.11253  [pdf, other]

    math.ST stat.ME

    Large-scale simultaneous inference under dependence

    Authors: Jinjin Tian, Xu Chen, Eugene Katsevich, Jelle Goeman, Aaditya Ramdas

    Abstract: Simultaneous inference allows for the exploration of data while deciding on criteria for proclaiming discoveries. It was recently proved that all admissible post-hoc inference methods for true discoveries must employ closed testing. In this paper, we investigate efficient closed testing with local tests of a special form: thresholding a function of sums of test scores for the individual hypotheses…

    Submitted 22 March, 2022; v1 submitted 22 February, 2021; originally announced February 2021.

    Comments: 41 pages

  8. arXiv:2006.08482   

    stat.ME stat.ML

    The leave-one-covariate-out conditional randomization test

    Authors: Eugene Katsevich, Aaditya Ramdas

    Abstract: Conditional independence testing is an important problem, yet provably hard without assumptions. One of the assumptions that has become popular of late is called "model-X", where we assume we know the joint distribution of the covariates, but assume nothing about the conditional distribution of the outcome given the covariates. Knockoffs is a popular methodology associated with this framework, but…

    Submitted 13 July, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Comments: This paper has been withdrawn by the authors, because it has now been merged with (and superseded by) a parallel work arXiv:2006.03980 by Molei Liu and Lucas Janson

  9. arXiv:2006.03980  [pdf, other]

    stat.ME

    Fast and Powerful Conditional Randomization Testing via Distillation

    Authors: Molei Liu, Eugene Katsevich, Lucas Janson, Aaditya Ramdas

    Abstract: We consider the problem of conditional independence testing: given a response Y and covariates (X,Z), we test the null hypothesis that Y is independent of X given Z. The conditional randomization test (CRT) was recently proposed as a way to use distributional information about X|Z to exactly (non-asymptotically) control Type-I error using any test statistic in any dimensionality without assuming a…

    Submitted 4 June, 2021; v1 submitted 6 June, 2020; originally announced June 2020.

    Comments: This paper has been merged with a parallel work arXiv:2006.08482 by Eugene Katsevich and Aaditya Ramdas

  10. arXiv:2005.05506  [pdf, other]

    math.ST stat.ME stat.ML

    On the power of conditional independence testing under model-X

    Authors: Eugene Katsevich, Aaditya Ramdas

    Abstract: For testing conditional independence (CI) of a response Y and a predictor X given covariates Z, the recently introduced model-X (MX) framework has been the subject of active methodological research, especially in the context of MX knockoffs and their successful application to genome-wide association studies. In this paper, we study the power of MX CI tests, yielding quantitative insights into the…

    Submitted 29 October, 2022; v1 submitted 11 May, 2020; originally announced May 2020.

  11. arXiv:2003.07236  [pdf, other]

    math.AP cond-mat.mtrl-sci cond-mat.stat-mech

    Analysis of a fourth order exponential PDE arising from a crystal surface jump process with Metropolis-type transition rates

    Authors: Yuan Gao, Anya E. Katsevich, Jian-Guo Liu, Jianfeng Lu, Jeremy L. Marzuola

    Abstract: We analytically and numerically study a fourth order PDE modeling rough crystal surface diffusion on the macroscopic level. We discuss existence of solutions globally in time and long time dynamics for the PDE model. The PDE, originally derived by the second author, is the continuum limit of a microscopic model of the surface dynamics, given by a Markov jump process with Metropolis type transition…

    Submitted 19 November, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

    Comments: 14 pages, 4 figures, comments welcome! Revised significantly thanks to very thorough referee reports. Some previous discussions have been removed and will be reported in a separate result by one of the authors

    Journal ref: Pure Appl. Analysis 3 (2021) 595-612

  12. arXiv:1809.01792  [pdf, other]

    stat.ME

    Filtering the rejection set while preserving false discovery rate control

    Authors: Eugene Katsevich, Chiara Sabatti, Marina Bogomolov

    Abstract: Scientific hypotheses in a variety of applications have domain-specific structures, such as the tree structure of the International Classification of Diseases (ICD), the directed acyclic graph structure of the Gene Ontology (GO), or the spatial structure in genome-wide association studies. In the context of multiple testing, the resulting relationships among hypotheses can create redundancies amon…

    Submitted 10 April, 2020; v1 submitted 5 September, 2018; originally announced September 2018.

  13. Simultaneous high-probability bounds on the false discovery proportion in structured, regression, and online settings

    Authors: Eugene Katsevich, Aaditya Ramdas

    Abstract: While traditional multiple testing procedures prohibit adaptive analysis choices made by users, Goeman and Solari (2011) proposed a simultaneous inference framework that allows users such flexibility while preserving high-probability bounds on the false discovery proportion (FDP) of the chosen set. In this paper, we propose a new class of such simultaneous FDP bounds, tailored for nested sequences…

    Submitted 1 December, 2019; v1 submitted 18 March, 2018; originally announced March 2018.

    Journal ref: Annals of Statistics 2020, Vol. 48, No. 6, 3465-3487

  14. arXiv:1706.09375  [pdf, other]

    stat.ME

    Multilayer Knockoff Filter: Controlled variable selection at multiple resolutions

    Authors: Eugene Katsevich, Chiara Sabatti

    Abstract: We tackle the problem of selecting from among a large number of variables those that are 'important' for an outcome. We consider situations where groups of variables are also of interest in their own right. For example, each variable might be a genetic polymorphism and we might want to study how a trait depends on variability in genes, segments of DNA that typically contain multiple such polymorph…

    Submitted 9 August, 2018; v1 submitted 28 June, 2017; originally announced June 2017.

  15. Covariance estimation using conjugate gradient for 3D classification in Cryo-EM

    Authors: Joakim Andén, Eugene Katsevich, Amit Singer

    Abstract: Classifying structural variability in noisy projections of biological macromolecules is a central problem in Cryo-EM. In this work, we build on a previous method for estimating the covariance matrix of the three-dimensional structure present in the molecules being imaged. Our proposed method allows for incorporation of contrast transfer function and non-uniform distribution of viewing angles, maki…

    Submitted 11 February, 2015; v1 submitted 2 December, 2014; originally announced December 2014.