
Showing 1–18 of 18 results for author: Zwiernik, P

  1. arXiv:2502.15584  [pdf, other]

    math.ST stat.ME

    Improving variable selection properties by leveraging external data

    Authors: Paul Rognon-Vael, David Rossell, Piotr Zwiernik

    Abstract: Sparse high-dimensional signal recovery is only possible under certain conditions on the number of parameters, sample size, signal strength and underlying sparsity. We show that leveraging external information, as is possible with data integration or transfer learning, makes it possible to push these mathematical limits. Specifically, we consider external information that allows splitting parameters into blocks…

    Submitted 29 March, 2025; v1 submitted 21 February, 2025; originally announced February 2025.

    MSC Class: 62F07 (Primary) 62C12; 62R07 (Secondary)

  2. arXiv:2501.10538  [pdf, other]

    cs.LG math.ST stat.ML

    Universality of Benign Overfitting in Binary Linear Classification

    Authors: Ichiro Hashimoto, Stanislav Volgushev, Piotr Zwiernik

    Abstract: The practical success of deep learning has led to the discovery of several surprising phenomena. One of these phenomena, which has spurred intense theoretical research, is "benign overfitting": deep neural networks seem to generalize well in the over-parametrized regime even though the networks show a perfect fit to noisy training data. It is now known that benign overfitting also occurs in vario…

    Submitted 17 January, 2025; originally announced January 2025.

    Comments: 66 pages, 5 figures

    MSC Class: 14J62

  3. arXiv:2306.03590  [pdf, ps, other]

    math.ST stat.ML

    Entropic covariance models

    Authors: Piotr Zwiernik

    Abstract: In covariance matrix estimation, one of the challenges lies in finding a suitable model and an efficient estimation method. Two commonly used modelling approaches in the literature involve imposing linear restrictions on the covariance matrix or its inverse. Another approach considers linear restrictions on the matrix logarithm of the covariance matrix. In this paper, we present a general framewor…

    Submitted 7 May, 2024; v1 submitted 6 June, 2023; originally announced June 2023.

    MSC Class: 62H99

  4. arXiv:2210.11107  [pdf, other]

    stat.AP

    Graphical model inference with external network data

    Authors: Jack Jewson, Li Li, Laura Battaglia, Stephen Hansen, David Rossell, Piotr Zwiernik

    Abstract: We consider two applications where we study how dependence structure between many variables is linked to external network data. We first study the interplay between social media connectedness and the co-evolution of the COVID-19 pandemic across USA counties. We next study how the dependence between stock market returns across firms relates to similarities in economic and policy indicators fr…

    Submitted 13 November, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

  5. arXiv:2206.13668  [pdf, other]

    math.ST stat.ME

    Non-Independent Components Analysis

    Authors: Geert Mesters, Piotr Zwiernik

    Abstract: A seminal result in the ICA literature states that for $AY = \varepsilon$, if the components of $\varepsilon$ are independent and at most one is Gaussian, then $A$ is identified up to sign and permutation of its rows (Comon, 1994). In this paper we study to what extent the independence assumption can be relaxed by replacing it with restrictions on higher order moment or cumulant tensors of…

    Submitted 19 March, 2024; v1 submitted 27 June, 2022; originally announced June 2022.

    MSC Class: 15A69; 62H99

  6. arXiv:2112.14727  [pdf, other]

    math.ST stat.ME

    Total positivity in multivariate extremes

    Authors: Frank Röttger, Sebastian Engelke, Piotr Zwiernik

    Abstract: Positive dependence is present in many real world data sets and has appealing stochastic properties that can be exploited in statistical modeling and in estimation. In particular, the notion of multivariate total positivity of order 2 ($ \mathrm{MTP}_{2} $) is a convex constraint and acts as an implicit regularizer in the Gaussian case. We study positive dependence in multivariate extremes and int…

    Submitted 14 June, 2023; v1 submitted 29 December, 2021; originally announced December 2021.

    Comments: 41 pages, 3 figures

    MSC Class: 60G70; 62H22; 15B48

  7. arXiv:2112.00816  [pdf, other]

    stat.ME math.ST

    Maximum Likelihood Estimation for Brownian Motion Tree Models Based on One Sample

    Authors: Michael Truell, Jan-Christian Hütter, Chandler Squires, Piotr Zwiernik, Caroline Uhler

    Abstract: We study the problem of maximum likelihood estimation given one data sample ($n=1$) over Brownian Motion Tree Models (BMTMs), a class of Gaussian models on trees. BMTMs are often used as a null model in phylogenetics, where the one-sample regime is common. Specifically, we show that, almost surely, the one-sample BMTM maximum likelihood estimator (MLE) exists, is unique, and corresponds to a fully…

    Submitted 24 November, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

    MSC Class: 62F30; 62H12; 90C39; 62P10

  8. arXiv:2102.05472  [pdf, other]

    stat.ML cs.LG math.ST

    Robust estimation of tree structured models

    Authors: Marta Casanellas, Marina Garrote-López, Piotr Zwiernik

    Abstract: Consider the problem of learning undirected graphical models on trees from corrupted data. Recently Katiyar et al. showed that it is possible to recover trees from noisy binary data up to a small equivalence class of possible trees. Their other paper on the Gaussian case follows a similar pattern. By framing this as a special phylogenetic recovery problem we largely generalize these two settings.…

    Submitted 10 February, 2021; originally announced February 2021.

    MSC Class: 62H22; 62R01; 60J20

  9. arXiv:2008.04688  [pdf, other]

    stat.ME math.ST

    Locally associated graphical models and mixed convex exponential families

    Authors: Steffen Lauritzen, Piotr Zwiernik

    Abstract: The notion of multivariate total positivity has proved to be useful in finance and psychology but may be too restrictive in other applications. In this paper we propose a concept of local association, where highly connected components in a graphical model are positively associated and study its properties. Our main motivation comes from gene expression data, where graphical models have become a po…

    Submitted 9 February, 2022; v1 submitted 11 August, 2020; originally announced August 2020.

    Comments: Supplementary material available at http://econ.upf.edu/~piotr/supps/2020-LZ-golazo.zip

    MSC Class: 62H05; 62H12; 62H22

  10. Estimating linear covariance models with numerical nonlinear algebra

    Authors: Bernd Sturmfels, Sascha Timme, Piotr Zwiernik

    Abstract: Numerical nonlinear algebra is applied to maximum likelihood estimation for Gaussian models defined by linear constraints on the covariance matrix. We examine the generic case as well as special models (e.g. Toeplitz, sparse, trees) that are of interest in statistics. We study the maximum likelihood degree and its dual analogue, and we introduce a new software package LinearCovarianceModels.jl for…

    Submitted 2 September, 2019; originally announced September 2019.

    Comments: 23 pages, 2 figures, 5 tables

    Journal ref: Alg. Stat. 11 (2020) 31-52

  11. arXiv:1906.09501  [pdf, other]

    math.ST cs.DS cs.LG stat.ML

    Learning partial correlation graphs and graphical models by covariance queries

    Authors: Gábor Lugosi, Jakub Truszkowski, Vasiliki Velona, Piotr Zwiernik

    Abstract: We study the problem of recovering the structure underlying large Gaussian graphical models or, more generally, partial correlation graphs. In high-dimensional problems it is often too costly to store the entire sample covariance matrix. We propose a new input model in which one can query single entries of the covariance matrix. We prove that it is possible to recover the support of the inverse co…

    Submitted 12 October, 2021; v1 submitted 22 June, 2019; originally announced June 2019.

    Comments: The title of the paper was changed. The previous title was 'Structure learning in graphical models by covariance queries'. Other minor changes suggested by referees were also implemented

    MSC Class: 62H99

    Journal ref: Journal of Machine Learning Research, Year 2021, Volume 22, Issue 203, Pages 1-41

  12. arXiv:1905.00516  [pdf, other]

    stat.ME math.ST

    Total positivity in exponential families with application to binary variables

    Authors: Steffen Lauritzen, Caroline Uhler, Piotr Zwiernik

    Abstract: We study exponential families of distributions that are multivariate totally positive of order 2 (MTP2), show that these are convex exponential families, and derive conditions for existence of the MLE. Quadratic exponential families of MTP2 distributions contain attractive Gaussian graphical models and ferromagnetic Ising models as special examples. We show that these are defined by intersecting th…

    Submitted 26 July, 2020; v1 submitted 1 May, 2019; originally announced May 2019.

    MSC Class: 60E15; 62H99; 15B48

    Journal ref: Annals of Statistics 2021, Vol. 49, 1436-1459

  13. arXiv:1708.00847  [pdf, other]

    math.ST stat.ML

    Latent tree models

    Authors: Piotr Zwiernik

    Abstract: Latent tree models are graphical models defined on trees, in which only a subset of variables is observed. They were first discussed by Judea Pearl as tree-decomposable distributions to generalise star-decomposable distributions such as the latent class model. Latent tree models, or their submodels, are widely used in: phylogenetic analysis, network tomography, computer vision, causal modeling, an…

    Submitted 2 August, 2017; originally announced August 2017.

    MSC Class: 62-02

  14. arXiv:1702.04031  [pdf, other]

    stat.ME math.ST

    Maximum likelihood estimation in Gaussian models under total positivity

    Authors: Steffen Lauritzen, Caroline Uhler, Piotr Zwiernik

    Abstract: We analyze the problem of maximum likelihood estimation for Gaussian distributions that are multivariate totally positive of order two (MTP2). By exploiting connections to phylogenetics and single-linkage clustering, we give a simple proof that the maximum likelihood estimator (MLE) for such distributions exists based on at least 2 observations, irrespective of the underlying dimension. Slawski an…

    Submitted 26 May, 2018; v1 submitted 13 February, 2017; originally announced February 2017.

    MSC Class: 60E15; 62H99; 15B48

  15. arXiv:1508.00436  [pdf, other]

    stat.ME

    The correlation space of Gaussian latent tree models and model selection without fitting

    Authors: Nathaniel Shiers, Piotr Zwiernik, John A. D. Aston, Jim Q. Smith

    Abstract: We provide a complete description of possible covariance matrices consistent with a Gaussian latent tree model for any tree. We then present techniques for utilising these constraints to assess whether observed data is compatible with that Gaussian latent tree model. Our method does not require us first to fit such a tree. We demonstrate the usefulness of the inverse-Wishart distribution for perfo…

    Submitted 11 April, 2016; v1 submitted 3 August, 2015; originally announced August 2015.

    Comments: 15 pages

  16. arXiv:1412.8285  [pdf, other]

    stat.ME math.ST stat.ML

    Marginal likelihood and model selection for Gaussian latent tree and forest models

    Authors: Mathias Drton, Shaowei Lin, Luca Weihs, Piotr Zwiernik

    Abstract: Gaussian latent tree models, or more generally, Gaussian latent forest models have Fisher-information matrices that become singular along interesting submodels, namely, models that correspond to subforests. For these singularities, we compute the real log-canonical thresholds (also known as stochastic complexities or learning coefficients) that quantify the large-sample behavior of the marginal li…

    Submitted 22 December, 2015; v1 submitted 29 December, 2014; originally announced December 2014.

  17. arXiv:1311.5655  [pdf, other]

    stat.ME

    Binary distributions of concentric rings

    Authors: N. Wermuth, G. M. Marchetti, P. Zwiernik

    Abstract: We introduce families of jointly symmetric, binary distributions that are generated over directed star graphs whose nodes represent variables and whose edges indicate positive dependences. The families are parametrized in terms of a single parameter. It is an outstanding feature of these distributions that joint probabilities relate to evenly-spaced concentric rings. Kronecker product characteriza…

    Submitted 29 July, 2014; v1 submitted 22 November, 2013; originally announced November 2013.

    Comments: 12 pages, 1 figure, 8 tables

    Journal ref: Journal of Multivariate Analysis 130 (2014) 252-260

  18. arXiv:1208.3553  [pdf, other]

    stat.ME math.ST

    The Dependence of Routine Bayesian Model Selection Methods on Irrelevant Alternatives

    Authors: Piotr Zwiernik, Jim Q. Smith

    Abstract: Bayesian methods - either based on Bayes Factors or BIC - are now widely used for model selection. One property that might reasonably be demanded of any model selection method is that if a model ${M}_{1}$ is preferred to a model ${M}_{0}$, when these two models are expressed as members of one model class $\mathbb{M}$, this preference is preserved when they are embedded in a different class…

    Submitted 17 August, 2012; originally announced August 2012.

    MSC Class: 62F15; 62F35