Showing 1–25 of 25 results for author: Chiquet, J

  1. arXiv:2503.22467  [pdf, other]

    stat.ME stat.AP stat.CO

    An integrated method for clustering and association network inference

    Authors: Jeanne Tous, Julien Chiquet

    Abstract: We consider high dimensional Gaussian graphical models inference. These models provide a rigorous framework to describe a network of statistical dependencies between entities, such as genes in genomic regulation studies or species in ecology. Penalized methods, including the standard Graphical-Lasso, are well-known approaches to infer the parameters of these models. As the number of variables in t…

    Submitted 28 March, 2025; originally announced March 2025.

  2. arXiv:2411.08524  [pdf, other]

    stat.ME

    Evaluating Parameter Uncertainty in the Poisson Lognormal Model with Corrected Variational Estimators

    Authors: Bastien Batardière, Julien Chiquet, Mahendra Mariadassou

    Abstract: Count data analysis is essential across diverse fields, from ecology and accident analysis to single-cell RNA sequencing (scRNA-seq) and metagenomics. While log transformations are computationally efficient, model-based approaches such as the Poisson-Log-Normal (PLN) model provide robust statistical foundations and are more amenable to extensions. The PLN model, with its latent Gaussian structure,…

    Submitted 13 November, 2024; originally announced November 2024.

  3. Importance sampling-based gradient method for dimension reduction in Poisson log-normal model

    Authors: Bastien Batardière, Julien Chiquet, Joon Kwon, Julien Stoehr

    Abstract: High-dimensional count data poses significant challenges for statistical analysis, necessitating effective methods that also preserve explainability. We focus on a low rank constrained variant of the Poisson log-normal model, which relates the observed data to a latent low-dimensional multivariate Gaussian variable via a Poisson distribution. Variational inference methods have become a golden stan…

    Submitted 23 April, 2025; v1 submitted 1 October, 2024; originally announced October 2024.

    Journal ref: Electronic Journal of Statistics. Vol. 19 (1), pp. 2199-2238, 2025

  4. arXiv:2405.14711  [pdf, other]

    stat.ME stat.AP stat.ML

    Zero-inflation in the Multivariate Poisson Lognormal Family

    Authors: Bastien Batardière, Julien Chiquet, François Gindraud, Mahendra Mariadassou

    Abstract: Analyzing high-dimensional count data is a challenge and statistical model-based approaches provide an adequate and efficient framework that preserves explainability. The (multivariate) Poisson-Log-Normal (PLN) model is one such model: it assumes count data are driven by an underlying structured latent Gaussian variable, so that the dependencies between counts solely stems from the latent dependen…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 27 pages including appendices. 8 figures, 1 table

  5. Automated calibration of consensus weighted distance-based clustering approaches using sharp

    Authors: Barbara Bodinier, Dragana Vuckovic, Sabrina Rodrigues, Sarah Filippi, Julien Chiquet, Marc Chadeau-Hyam

    Abstract: In consensus clustering, a clustering algorithm is used in combination with a subsampling procedure to detect stable clusters. Previous studies on both simulated and real data suggest that consensus clustering outperforms native algorithms. We extend here consensus clustering to allow for attribute weighting in the calculation of pairwise distances using existing regularised approaches. We propose…

    Submitted 26 April, 2023; originally announced April 2023.

  6. arXiv:2201.13053  [pdf, other]

    math.PR math.ST stat.ML

    A Probabilistic Graph Coupling View of Dimension Reduction

    Authors: Hugues Van Assel, Thibault Espinasse, Julien Chiquet, Franck Picard

    Abstract: Most popular dimension reduction (DR) methods like t-SNE and UMAP are based on minimizing a cost between input and latent pairwise similarities. Though widely used, these approaches lack clear probabilistic foundations to enable a full understanding of their properties and limitations. To that extent, we introduce a unifying statistical framework based on the coupling of hidden graphs using cross…

    Submitted 5 October, 2023; v1 submitted 31 January, 2022; originally announced January 2022.

  7. Automated calibration for stability selection in penalised regression and graphical models

    Authors: Barbara Bodinier, Sarah Filippi, Therese Haugdahl Nost, Julien Chiquet, Marc Chadeau-Hyam

    Abstract: Stability selection represents an attractive approach to identify sparse sets of features jointly associated with an outcome in high-dimensional contexts. We introduce an automated calibration procedure via maximisation of an in-house stability score and accommodating a priori-known block structure (e.g. multi-OMIC) data. It applies to (LASSO) penalised regression and graphical models. Simulations…

    Submitted 22 February, 2023; v1 submitted 4 June, 2021; originally announced June 2021.

    Comments: Main paper 21 pages, SI: 17 pages

    MSC Class: 92D30; ACM Class: I.6; J.3

  8. arXiv:2011.08708  [pdf, other]

    stat.ME

    Adjusting the adjusted Rand Index -- A multinomial story

    Authors: Martina Sundqvist, Julien Chiquet, Guillem Rigaill

    Abstract: The Adjusted Rand Index ($ARI$) is arguably one of the most popular measures for cluster comparison. The adjustment of the $ARI$ is based on a hypergeometric distribution assumption which is unsatisfying from a modeling perspective as (i) it is not appropriate when the two clusterings are dependent, (ii) it forces the size of the clusters, and (iii) it ignores randomness of the sampling. In this w…

    Submitted 17 November, 2020; originally announced November 2020.

    Comments: 21 pages, 2 figures

  9. arXiv:2004.08312  [pdf, other]

    q-bio.MN q-bio.GN stat.AP

    Identification of deregulated transcription factors involved in subtypes of cancers

    Authors: Magali Champion, Julien Chiquet, Pierre Neuvial, Mohamed Elati, François Radvanyi, Etienne Birmelé

    Abstract: We propose a methodology for the identification of transcription factors involved in the deregulation of genes in tumoral cells. This strategy is based on the inference of a reference gene regulatory network that connects transcription factors to their downstream targets using gene expression data. The behavior of genes in tumor samples is then carefully compared to this network of reference to de…

    Submitted 17 April, 2020; originally announced April 2020.

    Journal ref: Proceedings of the 12th International Conference on Bioinformatics and Computational Biology, vol 70, pages 1--10

  10. arXiv:1906.12201  [pdf, other]

    stat.CO

    missSBM: An R Package for Handling Missing Values in the Stochastic Block Model

    Authors: Pierre Barbillon, Julien Chiquet, Timothée Tabouy

    Abstract: The Stochastic Block Model (SBM) is a popular probabilistic model for random graphs. It is commonly used for clustering network data by aggregating nodes that share similar connectivity patterns into blocks. When fitting an SBM to a network which is partially observed, it is important to take into account the underlying process that generates the missing values, otherwise the inference may be bias…

    Submitted 27 May, 2021; v1 submitted 28 June, 2019; originally announced June 2019.

    Comments: 32 pages

  11. arXiv:1810.12169  [pdf, other]

    stat.AP cs.LG math.ST stat.ME

    Fast Computation of Genome-Metagenome Interaction Effects

    Authors: Florent Guinot, Marie Szafranski, Julien Chiquet, Anouk Zancarini, Christine Le Signor, Christophe Mougel, Christophe Ambroise

    Abstract: Motivation. Association studies have been widely used to search for associations between common genetic variants observations and a given phenotype. However, it is now generally accepted that genes and environment must be examined jointly when estimating phenotypic variance. In this work we consider two types of biological markers: genotypic markers, which characterize an observation in terms of i…

    Submitted 18 June, 2020; v1 submitted 29 October, 2018; originally announced October 2018.

  12. arXiv:1806.03120  [pdf, other]

    stat.ME

    Variational inference for sparse network reconstruction from count data

    Authors: Julien Chiquet, Mahendra Mariadassou, Stéphane Robin

    Abstract: In multivariate statistics, the question of finding direct interactions can be formulated as a problem of network inference - or network reconstruction - for which the Gaussian graphical model (GGM) provides a canonical framework. Unfortunately, the Gaussian assumption does not apply to count data which are encountered in domains such as genomics, social sciences or ecology. To circumvent this l…

    Submitted 8 June, 2018; originally announced June 2018.

  13. arXiv:1707.04145  [pdf, other]

    math.ST

    Variable selection in multivariate linear models with high-dimensional covariance matrix estimation

    Authors: Marie Perrot-Dockès, Céline Lévy-Leduc, Laure Sansonnet, Julien Chiquet

    Abstract: In this paper, we propose a novel variable selection approach in the framework of multivariate linear models taking into account the dependence that may exist between the responses. It consists in estimating beforehand the covariance matrix of the responses and to plug this estimator in a Lasso criterion, in order to obtain a sparse estimator of the coefficient matrix. The properties of our approa…

    Submitted 13 July, 2017; originally announced July 2017.

  14. arXiv:1707.04141  [pdf, other]

    stat.ME

    Variational Inference for Stochastic Block Models from Sampled Data

    Authors: Timothée Tabouy, Pierre Barbillon, Julien Chiquet

    Abstract: This paper deals with non-observed dyads during the sampling of a network and consecutive issues in the inference of the Stochastic Block Model (SBM). We review sampling designs and recover Missing At Random (MAR) and Not Missing At Random (NMAR) conditions for the SBM. We introduce variants of the variational EM algorithm for inferring the SBM under various sampling designs (MAR and NMAR) all ava…

    Submitted 9 January, 2019; v1 submitted 13 July, 2017; originally announced July 2017.

  15. arXiv:1704.00076  [pdf, other]

    stat.AP

    A multivariate variable selection approach for analyzing LC-MS metabolomics data

    Authors: M. Perrot-Dockès, C. Lévy-Leduc, J. Chiquet, L. Sansonnet, M. Brégère, M. -P. Étienne, S. Robin, G. Genta-Jouve

    Abstract: Omic data are characterized by the presence of strong dependence structures that result either from data acquisition or from some underlying biological processes. In metabolomics, for instance, data resulting from Liquid Chromatography-Mass Spectrometry (LC-MS) -- a technique which gives access to a large coverage of metabolites -- exhibit such patterns. These data sets are typically used to find…

    Submitted 31 March, 2017; originally announced April 2017.

  16. arXiv:1703.06633  [pdf, other]

    stat.ME

    Variational inference for probabilistic Poisson PCA

    Authors: Julien Chiquet, Mahendra Mariadassou, Stéphane Robin

    Abstract: Many application domains such as ecology or genomics have to deal with multivariate non Gaussian observations. A typical example is the joint observation of the respective abundances of a set of species in a series of sites, aiming to understand the co-variations between these species. The Gaussian setting provides a canonical way to model such dependencies, but does not apply in general. We consi…

    Submitted 30 April, 2018; v1 submitted 20 March, 2017; originally announced March 2017.

    Comments: 27 pages

  17. arXiv:1603.03593  [pdf, other]

    stat.AP

    Fast Detection of Block Boundaries in Block Wise Constant Matrices: An Application to HiC data

    Authors: Vincent Brault, Julien Chiquet, Céline Lévy-Leduc

    Abstract: We propose a novel approach for estimating the location of block boundaries (change-points) in a random matrix consisting of a block wise constant matrix observed in white noise. Our method consists in rephrasing this task as a variable selection issue. We use a penalized least-squares criterion with an $\ell_1$-type penalty for dealing with this issue. We first provide some theoretical results en…

    Submitted 11 March, 2016; originally announced March 2016.

    Comments: 35 pages, 19 figures, submitted

    MSC Class: 62-07; 62F30; 62P10; 62J07; 62F12

  18. A model for gene deregulation detection using expression data

    Authors: Thomas Picchetti, Julien Chiquet, Mohamed Elati, Pierre Neuvial, Rémy Nicolle, Etienne Birmelé

    Abstract: In tumoral cells, gene regulation mechanisms are severely altered, and these modifications in the regulations may be characteristic of different subtypes of cancer. However, these alterations do not necessarily induce differential expressions between the subtypes. To answer this question, we propose a statistical methodology to identify the misregulated genes given a reference network and gene exp…

    Submitted 8 January, 2016; v1 submitted 21 May, 2015; originally announced May 2015.

    Report number: MAP5 2015-17

  19. arXiv:1407.5915  [pdf, other]

    stat.CO

    Fast tree inference with weighted fusion penalties

    Authors: Julien Chiquet, Pierre Gutierrez, Guillem Rigaill

    Abstract: Given a data set with many features observed in a large number of conditions, it is desirable to fuse and aggregate conditions which are similar to ease the interpretation and extract the main characteristics of the data. This paper presents a multidimensional fusion penalty framework to address this question when the number of conditions is large. If the fusion penalty is encoded by an $\ell_q$-n…

    Submitted 27 May, 2015; v1 submitted 22 July, 2014; originally announced July 2014.

  20. arXiv:1403.6168  [pdf, other]

    stat.ME

    Structured Regularization for conditional Gaussian Graphical Models

    Authors: Julien Chiquet, Tristan Mary-Huard, Stéphane Robin

    Abstract: Conditional Gaussian graphical models (cGGM) are a recent reparametrization of the multivariate linear regression model which explicitly exhibits $i)$ the partial covariances between the predictors and the responses, and $ii)$ the partial covariances between the responses themselves. Such models are particularly suitable for interpretability since partial covariances describe strong relationships…

    Submitted 25 September, 2014; v1 submitted 24 March, 2014; originally announced March 2014.

  21. arXiv:1210.2077  [pdf, other]

    stat.ML stat.CO

    Sparsity by Worst-Case Penalties

    Authors: Yves Grandvalet, Julien Chiquet, Christophe Ambroise

    Abstract: This paper proposes a new interpretation of sparse penalties such as the elastic-net and the group-lasso. Beyond providing a new viewpoint on these penalization schemes, our approach results in a unified optimization strategy. Our experiments demonstrate that this strategy, implemented on the elastic-net, is computationally extremely efficient for small to medium size problems. Our accompanying so…

    Submitted 19 July, 2017; v1 submitted 7 October, 2012; originally announced October 2012.

  22. arXiv:1103.2697  [pdf, ps, other]

    stat.ME stat.AP

    Sparsity with sign-coherent groups of variables via the cooperative-Lasso

    Authors: Julien Chiquet, Yves Grandvalet, Camille Charbonnier

    Abstract: We consider the problems of estimation and selection of parameters endowed with a known group structure, when the groups are assumed to be sign-coherent, that is, gathering either nonnegative, nonpositive or null parameters. To tackle this problem, we propose the cooperative-Lasso penalty. We derive the optimality conditions defining the cooperative-Lasso estimate for generalized linear models, an…

    Submitted 2 July, 2012; v1 submitted 14 March, 2011; originally announced March 2011.

    Comments: Published at http://dx.doi.org/10.1214/11-AOAS520 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS520

    Journal ref: Annals of Applied Statistics 2012, Vol. 6, No. 2, 795-830

  23. arXiv:0912.4434  [pdf, other]

    stat.ME

    Inferring Multiple Graphical Structures

    Authors: Julien Chiquet, Yves Grandvalet, Christophe Ambroise

    Abstract: Gaussian Graphical Models provide a convenient framework for representing dependencies between variables. Recently, this tool has received a high interest for the discovery of biological networks. The literature focuses on the case where a single network is inferred from a set of measurements, but, as wetlab data is typically scarce, several assays, where the experimental conditions affect interac…

    Submitted 12 May, 2010; v1 submitted 22 December, 2009; originally announced December 2009.

  24. Weighted-Lasso for Structured Network Inference from Time Course Data

    Authors: Camille Charbonnier, Julien Chiquet, Christophe Ambroise

    Abstract: We present a weighted-Lasso method to infer the parameters of a first-order vector auto-regressive model that describes time course expression data generated by directed gene-to-gene regulation networks. These networks are assumed to own a prior internal structure of connectivity which drives the inference method. This prior structure can be either derived from prior biological knowledge or infe…

    Submitted 9 December, 2009; v1 submitted 9 October, 2009; originally announced October 2009.

    Journal ref: Statistical Applications in Genetics and Molecular Biology: Vol. 9 : Iss. 1, Article 15, 2010.

  25. arXiv:0810.3177  [pdf, other]

    stat.ME stat.AP

    Inferring sparse Gaussian graphical models with latent structure

    Authors: Christophe Ambroise, Julien Chiquet, Catherine Matias

    Abstract: Our concern is selecting the concentration matrix's nonzero coefficients for a sparse Gaussian graphical model in a high-dimensional setting. This corresponds to estimating the graph of conditional dependencies between the variables. We describe a novel framework taking into account a latent structure on the concentration matrix. This latent structure is used to drive a penalty matrix and thus t…

    Submitted 17 October, 2008; originally announced October 2008.

    Comments: 35 pages, 15 figures

    Journal ref: Electron. J. Statist. Volume 3 (2009), 205-238.