Showing 1–2 of 2 results for author: Berge, K V d

Search v0.5.6 released 2020-02-24

arXiv:2011.12044 [pdf, other]

q-bio.GN stat.AP stat.ME

Structure learning for zero-inflated counts, with an application to single-cell RNA sequencing data

Authors: Thi Kim Hue Nguyen, Koen Van den Berge, Monica Chiogna, Davide Risso

Abstract: The problem of estimating the structure of a graph from observed data is of growing interest in the context of high-throughput genomic data, and single-cell RNA sequencing in particular. These, however, are challenging applications, since the data consist of high-dimensional counts with high variance and over-abundance of zeros. Here, we present a general framework for learning the structure of a… ▽ More The problem of estimating the structure of a graph from observed data is of growing interest in the context of high-throughput genomic data, and single-cell RNA sequencing in particular. These, however, are challenging applications, since the data consist of high-dimensional counts with high variance and over-abundance of zeros. Here, we present a general framework for learning the structure of a graph from single-cell RNA-seq data, based on the zero-inflated negative binomial distribution. We demonstrate with simulations that our approach is able to retrieve the structure of a graph in a variety of settings and we show the utility of the approach on real data. △ Less

Submitted 14 February, 2023; v1 submitted 24 November, 2020; originally announced November 2020.
arXiv:1810.01509 [pdf, other]

stat.ME math.ST stat.ML

Hierarchical community detection by recursive partitioning

Authors: Tianxi Li, Lihua Lei, Sharmodeep Bhattacharyya, Koen Van den Berge, Purnamrita Sarkar, Peter J. Bickel, Elizaveta Levina

Abstract: The problem of community detection in networks is usually formulated as finding a single partition of the network into some "correct" number of communities. We argue that it is more interpretable and in some regimes more accurate to construct a hierarchical tree of communities instead. This can be done with a simple top-down recursive partitioning algorithm, starting with a single community and se… ▽ More The problem of community detection in networks is usually formulated as finding a single partition of the network into some "correct" number of communities. We argue that it is more interpretable and in some regimes more accurate to construct a hierarchical tree of communities instead. This can be done with a simple top-down recursive partitioning algorithm, starting with a single community and separating the nodes into two communities by spectral clustering repeatedly, until a stopping rule suggests there are no further communities. This class of algorithms is model-free, computationally efficient, and requires no tuning other than selecting a stopping rule. We show that there are regimes where this approach outperforms K-way spectral clustering, and propose a natural framework for analyzing the algorithm's theoretical performance, the binary tree stochastic block model. Under this model, we prove that the algorithm correctly recovers the entire community tree under relatively mild assumptions. We apply the algorithm to a gene network based on gene co-occurrence in 1580 research papers on anemia, and identify six clusters of genes in a meaningful hierarchy. We also illustrate the algorithm on a dataset of statistics papers. △ Less

Submitted 14 May, 2020; v1 submitted 2 October, 2018; originally announced October 2018.

Search v0.5.6 released 2020-02-24