Showing 1–12 of 12 results for author: Li, J J

Searching in archive q-bio.
  1. arXiv:2502.09574  [pdf, other]

    stat.ME q-bio.GN

    Spatial Transcriptomics Iterative Hierarchical Clustering (stIHC): A Novel Method for Identifying Spatial Gene Co-Expression Modules

    Authors: Catherine Higgins, Jingyi Jessica Li, Michelle Carey

    Abstract: Recent advancements in spatial transcriptomics technologies allow researchers to simultaneously measure RNA expression levels for hundreds to thousands of genes while preserving spatial information within tissues, providing critical insights into spatial gene expression patterns, tissue organization, and gene functionality. However, existing methods for clustering spatially variable genes (SVGs) i…

    Submitted 13 February, 2025; originally announced February 2025.

  2. arXiv:2501.05012  [pdf, other]

    stat.ME q-bio.QM

    SyNPar: Synthetic Null Data Parallelism for High-Power False Discovery Rate Control in High-Dimensional Variable Selection

    Authors: Changhu Wang, Ziheng Zhang, Jingyi Jessica Li

    Abstract: Balancing false discovery rate (FDR) and statistical power to ensure reliable discoveries is a key challenge in high-dimensional variable selection. Although several FDR control methods have been proposed, most involve perturbing the original data, either by concatenating knockoff variables or splitting the data into two halves, both of which can lead to a loss of power. In this paper, we introduc…

    Submitted 9 January, 2025; originally announced January 2025.

  3. arXiv:2405.18779  [pdf, other]

    q-bio.QM stat.AP

    Categorization of 33 computational methods to detect spatially variable genes from spatially resolved transcriptomics data

    Authors: Guanao Yan, Shuo Harper Hua, Jingyi Jessica Li

    Abstract: In the analysis of spatially resolved transcriptomics data, detecting spatially variable genes (SVGs) is crucial. Numerous computational methods exist, but varying SVG definitions and methodologies lead to incomparable results. We review 33 state-of-the-art methods, categorizing SVGs into three types: overall, cell-type-specific, and spatial-domain-marker SVGs. Our review explains the intuitions u…

    Submitted 3 October, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

  4. arXiv:2309.13518  [pdf, other]

    q-bio.GN

    Categorization and analysis of 14 computational methods for estimating cell potency from single-cell RNA-seq data

    Authors: Qingyang Wang, Zhiqian Zhai, Qiuyu Lian, Dongyuan Song, Jingyi Jessica Li

    Abstract: In single-cell RNA sequencing (scRNA-seq) analysis, a key challenge is inferring hidden cellular dynamics from static cell snapshots. Various computational methods have been developed to address this, focusing on perspectives like pseudotime trajectories, RNA velocities, and estimating the differentiation potential of cells, often referred to as "cell potency." This review summarizes 14 methods fo…

    Submitted 30 August, 2024; v1 submitted 23 September, 2023; originally announced September 2023.

  5. Protocol for Executing and Benchmarking Eight Computational Doublet-Detection Methods in Single-Cell RNA Sequencing Data Analysis

    Authors: Nan Miles Xi, Jingyi Jessica Li

    Abstract: The existence of doublets is a key confounder in single-cell RNA sequencing (scRNA-seq) data analysis. Computational methods have been developed for detecting doublets from scRNA-seq data. We developed an R package DoubletCollection to integrate the installation and execution of eight doublet-detection methods. DoubletCollection also provides a unified interface to perform and visualize downstream…

    Submitted 25 June, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

    Journal ref: STAR Protocols 2(3) (2021) 100699

  6. Statistical hypothesis testing versus machine-learning binary classification: distinctions and guidelines

    Authors: Jingyi Jessica Li, Xin Tong

    Abstract: Making binary decisions is a common data analytical task in scientific research and industrial applications. In data sciences, there are two related but distinct strategies: hypothesis testing and binary classification. In practice, how to choose between these two strategies can be unclear and rather confusing. Here we summarize key distinctions between these two strategies in three aspects and li…

    Submitted 22 August, 2020; v1 submitted 3 July, 2020; originally announced July 2020.

    Journal ref: Patterns 1(7) (2020) 100115

  7. arXiv:1908.07084  [pdf, other]

    stat.AP q-bio.GN q-bio.QM

    Issues arising from benchmarking single-cell RNA sequencing imputation methods

    Authors: Wei Vivian Li, Jingyi Jessica Li

    Abstract: On June 25th, 2018, Huang et al. published a computational method SAVER on Nature Methods for imputing dropout gene expression levels in single cell RNA sequencing (scRNA-seq) data. Huang et al. performed a set of comprehensive benchmarking analyses, including comparison with the data from RNA fluorescence in situ hybridization, to demonstrate that SAVER outperformed two existing scRNA-seq imputat…

    Submitted 19 August, 2019; originally announced August 2019.

    Comments: 5 pages

  8. Modeling and analysis of RNA-seq data: a review from a statistical perspective

    Authors: Wei Vivian Li, Jingyi Jessica Li

    Abstract: Background: Since the invention of next-generation RNA sequencing (RNA-seq) technologies, they have become a powerful tool to study the presence and quantity of RNA molecules in biological samples and have revolutionized transcriptomic studies. The analysis of RNA-seq data at four different levels (samples, genes, transcripts, and exons) involve multiple statistical and computational questions, so…

    Submitted 1 May, 2018; v1 submitted 17 April, 2018; originally announced April 2018.

    Journal ref: Quantitative Biology 6 (2018) 195-209

  9. arXiv:1706.02366  [pdf, other]

    q-bio.QM math.DS

    Hybrid statistical and mechanistic mathematical model guides mobile health intervention for chronic pain

    Authors: Sara M. Clifton, Chaeryon Kang, Jingyi Jessica Li, Qi Long, Nirmish Shah, Daniel M. Abrams

    Abstract: Nearly a quarter of visits to the Emergency Department are for conditions that could have been managed via outpatient treatment; improvements that allow patients to quickly recognize and receive appropriate treatment are crucial. The growing popularity of mobile technology creates new opportunities for real-time adaptive medical intervention, and the simultaneous growth of big data sources allows…

    Submitted 7 June, 2017; originally announced June 2017.

    Comments: 13 pages, 15 figures, 5 tables

    Journal ref: J Comput Biol. 24(7) (2017) 675-688

  10. arXiv:1603.05915  [pdf, other]

    stat.AP q-bio.GN q-bio.QM

    MSIQ: Joint Modeling of Multiple RNA-seq Samples for Accurate Isoform Quantification

    Authors: Wei Vivian Li, Anqi Zhao, Shihua Zhang, Jingyi Jessica Li

    Abstract: Next-generation RNA sequencing (RNA-seq) technology has been widely used to assess full-length RNA isoform abundance in a high-throughput manner. RNA-seq data offer insight into gene expression levels and transcriptome structures, enabling us to better understand the regulation of gene expression and fundamental biological processes. Accurate isoform quantification from RNA-seq data is challenging…

    Submitted 2 December, 2017; v1 submitted 18 March, 2016; originally announced March 2016.

    MSC Class: 97K80; 47N30

    Journal ref: Ann. Appl. Stat. 12(1) (2018) 510-539

  11. TROM: A Testing-based Method for Finding Transcriptomic Similarity of Biological Samples

    Authors: Wei Vivian Li, Yiling Chen, Jingyi Jessica Li

    Abstract: Comparative transcriptomics has gained increasing popularity in genomic research thanks to the development of high-throughput technologies including microarray and next-generation RNA sequencing that have generated numerous transcriptomic data. An important question is to understand the conservation and differentiation of biological processes in different species. We propose a testing-based method…

    Submitted 30 August, 2016; v1 submitted 19 January, 2016; originally announced January 2016.

    Journal ref: Statistics in Biosciences 9 (2017) 105-136

  12. System Wide Analyses have Underestimated Protein Abundances and the Importance of Transcription in Mammals

    Authors: Jingyi Jessica Li, Peter J. Bickel, Mark D. Biggin

    Abstract: Large scale surveys in mammalian tissue culture cells suggest that the protein expressed at the median abundance is present at 8,000 - 16,000 molecules per cell and that differences in mRNA expression between genes explain only 10-40% of the differences in protein levels. We find, however, that these surveys have significantly underestimated protein abundances and the relative importance of transc…

    Submitted 30 January, 2014; v1 submitted 3 December, 2012; originally announced December 2012.

    Comments: v2 adds a model of all gene's protein and mRNA expression. v3 corrects the omission of dataset files. v4 and 5 extends models of all gene's expression. v6 adds two new ribosome footprint datasets. v7 The final version accepted at PeerJ. Adds NIH3T3 ribosome footprint data and removes modeling of all genes protein expression levels

    Journal ref: PeerJ (2014) e270