-
Multilevel Sampling in Algebraic Statistics
Authors:
Nathan Kirk,
Ivan Gvozdanović,
Sonja Petrović
Abstract:
This paper proposes a multilevel sampling algorithm for fiber sampling problems in algebraic statistics, inspired by Henry Wynn's suggestion to adapt multilevel Monte Carlo (MLMC) ideas to discrete models. Focusing on log-linear models, we sample from high-dimensional lattice fibers defined by algebraic constraints. Building on Markov basis methods and results from Diaconis and Sturmfels, our algorithm uses variable step sizes to accelerate exploration and reduce the need for long burn-in. We introduce a novel Fiber Coverage Score (FCS) based on Voronoi partitioning to assess sample quality, and highlight the utility of the Maximum Mean Discrepancy (MMD) quality metric. Simulations on benchmark fibers show that multilevel sampling outperforms naive MCMC approaches. Our results demonstrate that multilevel methods, when properly applied, provide practical benefits for discrete sampling in algebraic statistics.
Submitted 6 May, 2025;
originally announced May 2025.
-
Computation of dominant ideals
Authors:
Anna Maria Bigatti,
Nursel Erey,
Selvi Kara,
Augustine O'Keefe,
Sonja Petrović,
Pierpaola Santarsiero,
Janet Striuli
Abstract:
We consider the problem of determining whether a monomial ideal is dominant. This property is critical for determining for which monomial ideals the Taylor resolution is minimal. We first analyze dominant ideals with a fixed least common multiple of generators using combinatorial methods. Then, we adopt a probabilistic approach via the Erdős-Rényi type model, examining both homogeneous and non-homogeneous cases. This model offers an efficient alternative to exhaustive enumeration, allowing the study of dominance through small random samples, even in high-dimensional settings.
Submitted 16 April, 2025;
originally announced April 2025.
-
The Weak Lefschetz property and unimodality of Hilbert functions of random monomial algebras
Authors:
Uwe Nagel,
Sonja Petrović
Abstract:
In this work, we investigate the presence of the weak Lefschetz property (WLP) and Hilbert functions for various types of random standard graded Artinian algebras. If an algebra has the WLP then its Hilbert function is unimodal.
Using probabilistic models for random monomial algebras, our results and simulations suggest that in each considered regime the Hilbert functions of the produced algebras are unimodal with high probability. The WLP appears to be present with high probability most of the time. However, we propose that there is one scenario where the generated algebras fail to have the WLP with high probability.
Submitted 27 February, 2024;
originally announced February 2024.
-
New directions in algebraic statistics: Three challenges from 2023
Authors:
Yulia Alexandr,
Miles Bakenhus,
Mark Curiel,
Sameer K. Deshpande,
Elizabeth Gross,
Yuqi Gu,
Max Hill,
Joseph Johnson,
Bryson Kagy,
Vishesh Karwa,
Jiayi Li,
Hanbaek Lyu,
Sonja Petrović,
Jose Israel Rodriguez
Abstract:
In the last quarter of a century, algebraic statistics has established itself as an expanding field which uses multilinear algebra, commutative algebra, computational algebra, geometry, and combinatorics to tackle problems in mathematical statistics. These developments have found applications in a growing number of areas, including biology, neuroscience, economics, and social sciences.
Naturally, new connections continue to be made with other areas of mathematics and statistics. This paper outlines three such connections: to statistical models used in educational testing, to a classification problem for a family of nonparametric regression models, and to phase transition phenomena under uniform sampling of contingency tables. We illustrate the motivating problems, each of which is for algebraic statistics a new direction, and demonstrate an enhancement of related methodologies.
Submitted 21 February, 2024;
originally announced February 2024.
-
Irreducible Markov Chains on spaces of graphs with fixed degree-color sequences
Authors:
Félix Almendra-Hernández,
Jesús A. De Loera,
Sonja Petrović
Abstract:
We study a colored generalization of the famous simple-switch Markov chain for sampling the set of graphs with a fixed degree sequence. Here we consider the space of graphs with colored vertices, in which we fix the degree sequence and another statistic arising from the vertex coloring, and prove that the set can be connected with simple color-preserving switches or moves. These moves form a basis for defining an irreducible Markov chain necessary for testing statistical model fit to block-partitioned network data. Our methods further generalize well-known algebraic results from the 1990s: namely, that the corresponding moves can be used to construct a regular triangulation for a generalization of the second hypersimplex. On the other hand, in contrast to the monochromatic case, we show that for simple graphs, the 1-norm of the moves necessary to connect the space increases with the number of colors.
Submitted 14 February, 2024;
originally announced February 2024.
-
Sampling lattice points in a polytope: a Bayesian biased algorithm with random updates
Authors:
Miles Bakenhus,
Sonja Petrović
Abstract:
The set of nonnegative integer lattice points in a polytope, also known as the fiber of a linear map, makes an appearance in several applications including optimization and statistics. We address the problem of sampling from this set using three ingredients: an easy-to-compute lattice basis of the constraint matrix, a biased sampling algorithm with a Bayesian framework, and a step-wise selection method. The bias embedded in our algorithm updates sampler parameters to improve fiber discovery rate at each step chosen from previously discovered elements. We showcase the performance of the algorithm on several examples, including fibers that are out of reach for the state-of-the-art Markov bases samplers.
Submitted 5 July, 2023;
originally announced July 2023.
-
The Spark Randomizer: a learned randomized framework for computing Gröbner bases
Authors:
Shahrzad Jamshidi,
Sonja Petrović
Abstract:
We define a violator operator which captures the definition of a minimal Gröbner basis of an ideal. This construction places the problem of computing a Gröbner basis within the framework of violator spaces, introduced in 2008 by Gärtner, Matoušek, Rüst, and Škovroň in a different context. The key aspect which we use is their successful utilization of a Clarkson-style fast sampling algorithm from geometric optimization. Using the output of a machine learning algorithm, we combine the prediction of the size of a minimal Gröbner basis of an ideal with the Clarkson-style biased random sampling method to compute a Gröbner basis in expected runtime linear in the size of the violator space.
Submitted 14 June, 2023;
originally announced June 2023.
-
Markov bases: a 25 year update
Authors:
Félix Almendra-Hernández,
Jesús A. De Loera,
Sonja Petrović
Abstract:
In this paper, we evaluate the challenges and best practices associated with the Markov bases approach to sampling from conditional distributions. We provide insights and clarifications after 25 years of the publication of the fundamental theorem for Markov bases by Diaconis and Sturmfels. In addition to a literature review we prove three new results on the complexity of Markov bases in hierarchical models, relaxations of the fibers in log-linear models, and limitations of partial sets of moves in providing an irreducible Markov chain.
Submitted 9 January, 2024; v1 submitted 9 June, 2023;
originally announced June 2023.
-
Operational Research: Methods and Applications
Authors:
Fotios Petropoulos,
Gilbert Laporte,
Emel Aktas,
Sibel A. Alumur,
Claudia Archetti,
Hayriye Ayhan,
Maria Battarra,
Julia A. Bennell,
Jean-Marie Bourjolly,
John E. Boylan,
Michèle Breton,
David Canca,
Laurent Charlin,
Bo Chen,
Cihan Tugrul Cicek,
Louis Anthony Cox Jr,
Christine S. M. Currie,
Erik Demeulemeester,
Li Ding,
Stephen M. Disney,
Matthias Ehrgott,
Martin J. Eppler,
Güneş Erdoğan,
Bernard Fortz,
L. Alberto Franco, et al. (57 additional authors not shown)
Abstract:
Throughout its history, Operational Research has evolved to include a variety of methods, models and algorithms that have been applied to a diverse and wide range of contexts. This encyclopedic article consists of two main sections: methods and applications. The first aims to summarise the up-to-date knowledge and provide an overview of the state-of-the-art methods and key developments in the various subdomains of the field. The second offers a wide-ranging list of areas where Operational Research has been applied. The article is meant to be read in a nonlinear fashion. It should be used as a point of reference or first-port-of-call for a diverse pool of readers: academics, researchers, students, and practitioners. The entries within the methods and applications sections are presented in alphabetical order. The authors dedicate this paper to the 2023 Turkey/Syria earthquake victims. We sincerely hope that advances in OR will play a role towards minimising the pain and suffering caused by this and future catastrophes.
Submitted 13 January, 2024; v1 submitted 24 March, 2023;
originally announced March 2023.
-
Predicting the cardinality and maximum degree of a reduced Gröbner basis
Authors:
Shahrzad Jamshidi,
Eric Kang,
Sonja Petrović
Abstract:
We construct neural network regression models to predict key metrics of complexity for Gröbner bases of binomial ideals. This work illustrates why predictions with neural networks from Gröbner computations are not a straightforward process. Using two probabilistic models for random binomial ideals, we generate and make available a large data set that is able to capture sufficient variability in Gröbner complexity. We use this data to train neural networks and predict the cardinality of a reduced Gröbner basis and the maximum total degree of its elements. While the cardinality prediction problem is unlike classical problems tackled by machine learning, our simulations show that neural networks, providing performance statistics such as $r^2 = 0.401$, outperform a naive guess or multiple regression models with $r^2 = 0.180$.
Submitted 25 September, 2023; v1 submitted 10 February, 2023;
originally announced February 2023.
-
On commutators of compact operators via block tridiagonalization: generalizations and limitations of Anderson's approach
Authors:
Jireh Loreaux,
Sasmita Patnaik,
Srdjan Petrovic,
Gary Weiss
Abstract:
We offer a new perspective and some advances on the 1971 Pearcy--Topping problem: Is every compact operator a commutator of compact operators? Our goal is to analyze and generalize Joel Anderson's 1970s work in this area, combined with the work of the last named author of this paper. We reduce the general problem to a simpler sequence of finite matrix equations with norm constraints, while at the same time developing strategies for counterexamples. Our approach is to ask which compact operators $T$ are commutators $AB-BA$ of compact operators $A,B$, and to analyze the implications of Joel Anderson's contributions to this problem, which will yield a generalization of his method. By extending the techniques of Anderson [1] we obtain new classes of operators that are commutators of compact operators beyond those obtained in [17] and [2]. By employing the techniques of the last named author [22], we find obstructions to extending Anderson's techniques in terms of certain constraints for $T$, with special focus on when $T$ is a strictly positive compact diagonal operator. Some of these constraints involve general universal block tridiagonal matrix forms for operators, and some involve $\mathcal{B(H)}$-ideal constraints. In terms of these matrix forms, we give some equivalences, some sufficient conditions and some necessary conditions for this Pearcy--Topping problem and its various offshoots to hold true. These matrix forms are a sparsification of matrix representations of an operator (an increase in the proportion of zeros in its corners by a change of basis) and we measure the support density of these forms. Finally, we provide some necessary conditions for the Pearcy--Topping problem involving singular numbers and $\mathcal{B(H)}$-ideal constraints.
Submitted 22 March, 2022;
originally announced March 2022.
-
Marginal Independence Models
Authors:
Tobias Boege,
Sonja Petrović,
Bernd Sturmfels
Abstract:
We impose rank one constraints on marginalizations of a tensor, given by a simplicial complex. Following work of Kirkup and Sullivant, such marginal independence models can be made toric by a linear change of coordinates. We study their toric ideals, with emphasis on random graph models and independent set polytopes of matroids. We develop the numerical algebra of parameter estimation, using both Euclidean distance and maximum likelihood, and we present a comprehensive database of small models.
Submitted 10 May, 2022; v1 submitted 19 December, 2021;
originally announced December 2021.
-
Longitudinal Network Models and Permutation-Uniform Markov Chains
Authors:
William K. Schwartz,
Sonja Petrović,
Hemanshu Kaul
Abstract:
Consider longitudinal networks whose edges turn on and off according to a discrete-time Markov chain with exponential-family transition probabilities. We characterize when their joint distributions are also exponential families with the same parameter, improving data reduction. Further, we show that the permutation-uniform subclass of these chains permits interpretation as an independent, identically distributed sequence on the same state space. We then apply these ideas to temporal exponential random graph models, for which permutation uniformity is well suited, and discuss mean-parameter convergence, dyadic independence, and exchangeability. Our framework facilitates the introduction of a new network model; simplifies analysis of some network and autoregressive models from the literature, including by permitting closed-form expressions for maximum likelihood estimates for some models; and facilitates applying standard tools to longitudinal-network Markov chains from either asymptotics or single-observation exponential random graph models.
Submitted 10 March, 2024; v1 submitted 12 August, 2021;
originally announced August 2021.
-
Learning a performance metric of Buchberger's algorithm
Authors:
Jelena Mojsilović,
Dylan Peifer,
Sonja Petrović
Abstract:
What can be (machine) learned about the complexity of Buchberger's algorithm?
Given a system of polynomials, Buchberger's algorithm computes a Gröbner basis of the ideal these polynomials generate using an iterative procedure based on multivariate long division. The runtime of each step of the algorithm is typically dominated by a series of polynomial additions, and the total number of these additions is a hardware independent performance metric that is often used to evaluate and optimize various implementation choices. In this work we attempt to predict, using just the starting input, the number of polynomial additions that take place during one run of Buchberger's algorithm. Good predictions are useful for quickly estimating difficulty and understanding what features make Gröbner basis computation hard. Our features and methods could also be used for value models in the reinforcement learning approach to optimize Buchberger's algorithm introduced in [Peifer, Stillman, and Halpern-Leistner, 2020].
We show that a multiple linear regression model built from a set of easy-to-compute ideal generator statistics can predict the number of polynomial additions somewhat well, better than an uninformed model, and better than regression models built on some intuitive commutative algebra invariants that are more difficult to compute. We also train a simple recursive neural network that outperforms these linear models. Our work serves as a proof of concept, demonstrating that predicting the number of polynomial additions in Buchberger's algorithm is a feasible problem from the point of view of machine learning.
Submitted 31 May, 2022; v1 submitted 7 June, 2021;
originally announced June 2021.
-
Goodness of fit for log-linear ERGMs
Authors:
Elizabeth Gross,
Sonja Petrović,
Despina Stasi
Abstract:
Many popular models from the networks literature can be viewed through a common lens of contingency tables on network dyads, resulting in log-linear ERGMs: exponential family models for random graphs whose sufficient statistics are linear on the dyads. We propose a new model in this family, the $p_1$-SBM, which combines node and group effects common in network formation mechanisms. In particular, it is a generalization of several well-known ERGMs including the stochastic blockmodel for undirected graphs with known block assignment, the degree-corrected version of it, and the directed $p_1$ model without group structure.
We frame the problem of testing model fit for the log-linear ERGM class through an exact conditional test whose $p$-value can be approximated efficiently in networks of both small and moderately large sizes. The sampling methods we build rely on a dynamic adaptation of Markov bases. We use quick estimation algorithms adapted from the contingency table literature and effective sampling methods rooted in graph theory and algebraic statistics.
The performance and scalability of the method are demonstrated on two data sets from biology: the connectome of C. elegans and the interactome of Arabidopsis thaliana. These two networks -- a neuronal network and a protein-protein interaction network -- have been popular examples in the network science literature. Our work provides a model-based approach to studying them.
Submitted 3 March, 2024; v1 submitted 7 April, 2021;
originally announced April 2021.
-
Threaded Gröbner Bases: a Macaulay2 package
Authors:
Sonja Petrović,
Shahrzad Jamshidi Zelenberg
Abstract:
The complexity of Gröbner computations has inspired many improvements to Buchberger's algorithm over the years. Looking for further insights into the algorithm's performance, we offer a threaded implementation of classical Buchberger's algorithm in Macaulay2. The output of the main function of the package includes information about lineages of non-zero remainders that are added to the basis during the computation. This information can be used for further algorithm improvements and optimization.
Submitted 13 January, 2021; v1 submitted 16 November, 2020;
originally announced November 2020.
-
Algebraic statistics, tables, and networks: The Fienberg advantage
Authors:
Elizabeth Gross,
Vishesh Karwa,
Sonja Petrović
Abstract:
Stephen Fienberg's affinity for contingency table problems and reinterpreting models with a fresh look gave rise to a new approach for hypothesis testing of network models that are linear exponential families. We outline his vision and influence in this fundamental problem, as well as generalizations to multigraphs and hypergraphs.
Submitted 3 October, 2019;
originally announced October 2019.
-
What is... a Markov basis?
Authors:
Sonja Petrović
Abstract:
This short piece defines a Markov basis. The aim is to introduce the statistical concept to mathematicians.
Submitted 15 July, 2019;
originally announced July 2019.
-
Algebraic Statistics in Practice: Applications to Networks
Authors:
Marta Casanellas,
Sonja Petrović,
Caroline Uhler
Abstract:
Algebraic statistics uses tools from algebra (especially from multilinear algebra, commutative algebra and computational algebra), geometry and combinatorics to provide insight into knotty problems in mathematical statistics. In this survey we illustrate this on three problems related to networks, namely network models for relational data, causal structure discovery and phylogenetics. For each problem we give an overview of recent results in algebraic statistics with emphasis on the statistical achievements made possible by these tools and their practical relevance for applications to other scientific disciplines.
Submitted 22 June, 2019;
originally announced June 2019.
-
Universal Block Tridiagonalization in B(H) and Beyond
Authors:
Sasmita Patnaik,
Srdjan Petrovic,
Gary Weiss
Abstract:
For H a separable infinite dimensional complex Hilbert space, we prove that every B(H) operator has a basis with respect to which its matrix representation has a universal block tridiagonal form with block sizes given by a simple exponential formula independent of the operator. From this, such a matrix representation can be further sparsified to slightly sparser forms; it can lead to a direct sum of even sparser forms reflecting in part some of its reducing subspace structure; and in the case of operators without invariant subspaces (if any exists), it gives a plethora of sparser block tridiagonal representations. An extension to unbounded operators occurs for a certain domain of definition condition. Moreover this process gives rise to many different choices of block sizes.
Submitted 3 November, 2019; v1 submitted 2 May, 2019;
originally announced May 2019.
-
Random Monomial Ideals Macaulay2 Package
Authors:
Sonja Petrović,
Despina Stasi,
Dane Wilburne
Abstract:
The Macaulay2 package RandomMonomialIdeals provides users with a set of tools that allow for the systematic generation and study of random monomial ideals. It also introduces new objects, Sample and Model, to allow for streamlined handling of random objects and their statistics in Macaulay2.
Submitted 25 October, 2018; v1 submitted 27 November, 2017;
originally announced November 2017.
-
Hypergraph encodings of arbitrary toric ideals
Authors:
Sonja Petrović,
Apostolos Thoma,
Marius Vladoiu
Abstract:
Relying on the combinatorial classification of toric ideals using their bouquet structure, we focus on toric ideals of hypergraphs and study how they relate to general toric ideals. We show that hypergraphs exhibit a surprisingly general behavior: the toric ideal associated to any general matrix can be encoded by that of a $0/1$ matrix, while preserving the essential combinatorics of the original ideal. We provide two universality results about the unboundedness of degrees of various generating sets: minimal, Graver, universal Gröbner bases, and indispensable binomials. Finally, we provide a polarization-type operation for arbitrary positively graded toric ideals, which preserves all the combinatorial signatures and the homological properties of the original toric ideal.
Submitted 12 November, 2017;
originally announced November 2017.
-
The Multiple Roots Phenomenon in Maximum Likelihood Estimation for Factor Analysis
Authors:
Elizabeth Gross,
Sonja Petrović,
Donald Richards,
Despina Stasi
Abstract:
Multiple root estimation problems in statistical inference arise in many contexts in the literature. In the context of maximum likelihood estimation, the existence of multiple roots causes uncertainty in the computation of maximum likelihood estimators using hill-climbing algorithms, and consequent difficulties in the resulting statistical inference.
In this paper, we study the multiple roots phenomenon in maximum likelihood estimation for factor analysis. We prove that the corresponding likelihood equations have uncountably many feasible solutions even in the simplest cases. For the case in which the observed data are two-dimensional and the unobserved factor scores are one-dimensional, we prove that the solutions to the likelihood equations form a one-dimensional real curve.
Submitted 15 February, 2017;
originally announced February 2017.
-
Random Monomial Ideals
Authors:
Jesus A. De Loera,
Sonja Petrovic,
Lily Silverstein,
Despina Stasi,
Dane Wilburne
Abstract:
Inspired by the study of random graphs and simplicial complexes, and motivated by the need to understand average behavior of ideals, we propose and study probabilistic models of random monomial ideals. We prove theorems about the probability distributions, expectations and thresholds for events involving monomial ideals with given Hilbert function, Krull dimension, first graded Betti numbers, and present several experimentally-backed conjectures about regularity, projective dimension, strong genericity, and Cohen-Macaulayness of random monomial ideals.
Submitted 5 January, 2018; v1 submitted 24 January, 2017;
originally announced January 2017.
-
Monte Carlo goodness-of-fit tests for degree corrected and related stochastic blockmodels
Authors:
Vishesh Karwa,
Debdeep Pati,
Sonja Petrović,
Liam Solus,
Nikita Alexeev,
Mateja Raič,
Dane Wilburne,
Robert Williams,
Bowei Yan
Abstract:
We construct Bayesian and frequentist finite-sample goodness-of-fit tests for three different variants of the stochastic blockmodel for network data. Since all of the stochastic blockmodel variants are log-linear in form when block assignments are known, the tests for the \emph{latent} block model versions combine a block membership estimator with the algebraic statistics machinery for testing goodness-of-fit in log-linear models. We describe Markov bases and marginal polytopes of the variants of the stochastic blockmodel, and discuss how both facilitate the development of goodness-of-fit tests and understanding of model behavior.
The general testing methodology developed here extends to any finite mixture of log-linear models on discrete data, and as such is the first application of the algebraic statistics machinery for latent-variable models.
Submitted 6 March, 2024; v1 submitted 18 December, 2016;
originally announced December 2016.
-
The virial theorem and ground state energy estimate of nonlinear Schrödinger equations in $\mathbb{R}^2$ with square root and saturable nonlinearities in nonlinear optics
Authors:
Tai-Chia Lin,
Milivoj R. Belic,
Milan S. Petrovic,
Hichem Hajaiej,
Goong Chen
Abstract:
The virial theorem is a useful property of the linear Schrödinger equation in atomic and molecular physics: it gives an elegant ratio between the kinetic and potential energies and helps assess the quality of numerically computed eigenvalues. If the governing equation is a nonlinear Schrödinger equation with power-law nonlinearity, then a similar ratio can be obtained, but there seems to be no way of getting any eigenvalue estimate. Surprisingly, when the nonlinearity is either square-root or saturable (not a power law), one can develop a virial theorem and an eigenvalue estimate for nonlinear Schrödinger (NLS) equations in $\mathbb{R}^2$ with square-root and saturable nonlinearity, respectively. Furthermore, we show here that the eigenvalue estimate can be used to obtain the second-order term (which is of order $\ln Γ$) of the lower bound of the ground state energy as the coefficient $Γ$ of the nonlinear term tends to infinity.
Submitted 22 August, 2016;
originally announced August 2016.
-
On the Geometry and Extremal Properties of the Edge-Degeneracy Model
Authors:
Nicolas Kim,
Dane Wilburne,
Sonja Petrović,
Alessandro Rinaldo
Abstract:
The edge-degeneracy model is an exponential random graph model that uses the graph degeneracy, a measure of the graph's connection density, and number of edges in a graph as its sufficient statistics. We show this model is relatively well-behaved by studying the statistical degeneracy of this model through the geometry of the associated polytope.
Submitted 16 September, 2016; v1 submitted 30 January, 2016;
originally announced February 2016.
-
A survey of discrete methods in (algebraic) statistics for networks
Authors:
Sonja Petrović
Abstract:
Sampling algorithms, hypergraph degree sequences, and polytopes play a crucial role in statistical analysis of network data. This article offers a brief overview of open problems in this area of discrete mathematics from the point of view of a particular family of statistical models for networks called exponential random graph models. The problems and underlying constructions are also related to well-known concepts in commutative algebra and graph-theoretic concepts in computer science. We outline a few lines of recent work that highlight the natural connection between these fields and unify them into some open problems. While these problems are often relevant in discrete mathematics in their own right, the emphasis here is on statistical relevance with the hope that these lines of research do not remain disjoint. Suggested specific open problems and general research questions should advance algebraic statistics theory as well as applied statistical tools for rigorous statistical analysis of networks.
Submitted 8 January, 2016; v1 submitted 9 October, 2015;
originally announced October 2015.
-
Bouquet algebra of toric ideals
Authors:
Sonja Petrović,
Apostolos Thoma,
Marius Vladoiu
Abstract:
To any toric ideal $I_A$, encoded by an integer matrix $A$, we associate a matroid structure called {\em the bouquet graph} of $A$ and introduce another toric ideal called {\em the bouquet ideal} of $A$. We show how these objects capture the essential combinatorial and algebraic information about $I_A$. Passing from the toric ideal to its bouquet ideal reveals a structure that allows us to classify several cases. For example, on the one end of the spectrum, there are ideals that we call {\em stable}, for which bouquets capture the complexity of various generating sets as well as the minimal free resolution. On the other end of the spectrum lie toric ideals whose various bases (e.g., minimal generating sets, Gröbner, Graver bases) coincide. Apart from allowing for classification-type results, bouquets provide a new way to construct families of examples of toric ideals with various interesting properties, such as robustness, genericity, and unimodularity. The new bouquet framework can be used to provide a characterization of toric ideals whose Graver basis, the universal Gröbner basis, any reduced Gröbner basis and any minimal generating set coincide.
Submitted 7 November, 2017; v1 submitted 9 July, 2015;
originally announced July 2015.
-
Random Sampling in Computational Algebra: Helly Numbers and Violator Spaces
Authors:
Jesús A. De Loera,
Sonja Petrović,
Despina Stasi
Abstract:
This paper transfers a randomized algorithm, originally used in geometric optimization, to computational problems in commutative algebra. We show that Clarkson's sampling algorithm can be applied to two problems in computational algebra: solving large-scale polynomial systems and finding small generating sets of graded ideals. The cornerstone of our work is showing that the theory of violator spaces of Gärtner et al.\ applies to polynomial ideal problems. To show this, one utilizes a Helly-type result for algebraic varieties. The resulting algorithms have expected runtime linear in the number of input polynomials, making the ideas interesting for handling systems with very large numbers of polynomials, but whose rank in the vector space of polynomials is small (e.g., when the number of variables and degree is constant).
Submitted 23 December, 2015; v1 submitted 30 March, 2015;
originally announced March 2015.
-
Blow-up algebras, determinantal ideals, and Dedekind-Mertens-like formulas
Authors:
Alberto Corso,
Uwe Nagel,
Sonja Petrović,
Cornelia Yuen
Abstract:
We investigate Rees algebras and special fiber rings obtained by blowing up specialized Ferrers ideals. This class of monomial ideals includes strongly stable monomial ideals generated in degree two and edge ideals of prominent classes of graphs. We identify the equations of these blow-up algebras. They generate determinantal ideals associated to subregions of a generic symmetric matrix, which may have holes. Exhibiting Gröbner bases for these ideals and using methods from Gorenstein liaison theory, we show that these determinantal rings are normal Cohen-Macaulay domains that are Koszul, that the initial ideals correspond to vertex decomposable simplicial complexes, and we determine their Hilbert functions and Castelnuovo-Mumford regularities. As a consequence, we find explicit minimal reductions for all Ferrers and many specialized Ferrers ideals, as well as their reduction numbers. These results can be viewed as extensions of the classical Dedekind-Mertens formula for the content of the product of two polynomials.
Submitted 8 August, 2016; v1 submitted 11 February, 2015;
originally announced February 2015.
-
Statistical models for cores decomposition of an undirected random graph
Authors:
Vishesh Karwa,
Michael J. Pelsmajer,
Sonja Petrović,
Despina Stasi,
Dane Wilburne
Abstract:
The $k$-core decomposition is a widely studied summary statistic that describes a graph's global connectivity structure. In this paper, we move beyond using $k$-core decomposition as a tool to summarize a graph and propose using $k$-core decomposition as a tool to model random graphs. We propose using the shell distribution vector, a way of summarizing the decomposition, as a sufficient statistic for a family of exponential random graph models. We study the properties and behavior of the model family, implement a Markov chain Monte Carlo algorithm for simulating graphs from the model, implement a direct sampler from the set of graphs with a given shell distribution, and explore the sampling distributions of some of the commonly used complementary statistics as good candidates for heuristic model fitting. These algorithms provide first fundamental steps necessary for solving the following problems: parameter estimation in this ERGM, extending the model to its Bayesian relative, and developing a rigorous methodology for testing goodness of fit of the model and model selection. The methods are applied to a synthetic network as well as the well-known Sampson monks dataset.
Submitted 28 November, 2016; v1 submitted 27 October, 2014;
originally announced October 2014.
-
$β$ models for random hypergraphs with a given degree sequence
Authors:
Despina Stasi,
Kayvan Sadeghi,
Alessandro Rinaldo,
Sonja Petrović,
Stephen E. Fienberg
Abstract:
We introduce the beta model for random hypergraphs in order to represent the occurrence of multi-way interactions among agents in a social network. This model builds upon and generalizes the well-studied beta model for random graphs, which instead only considers pairwise interactions. We provide two algorithms for fitting the model parameters, IPS (iterative proportional scaling) and a fixed point algorithm, prove that both algorithms converge if the maximum likelihood estimator (MLE) exists, and provide algorithmic and geometric ways of dealing with the issue of MLE existence.
Submitted 3 July, 2014;
originally announced July 2014.
-
Goodness-of-fit for log-linear network models: Dynamic Markov bases using hypergraphs
Authors:
Elizabeth Gross,
Sonja Petrović,
Despina Stasi
Abstract:
Social networks and other large sparse data sets pose significant challenges for statistical inference, as many standard statistical methods for testing model fit are not applicable in such settings. Algebraic statistics offers a theoretically justified approach to goodness-of-fit testing that relies on the theory of Markov bases and is intimately connected with the geometry of the model as described by its fibers.
Most current practices require the computation of the entire basis, which is infeasible in many practical settings. We present a dynamic approach to explore the fiber of a model, which bypasses this issue, and is based on the combinatorics of hypergraphs arising from the toric algebra structure of log-linear models.
We demonstrate the approach on the Holland-Leinhardt $p_1$ model for random directed graphs that allows for reciprocated edges.
Submitted 20 January, 2014;
originally announced January 2014.
-
Fibers of multi-way contingency tables given conditionals: relation to marginals, cell bounds and Markov bases
Authors:
Aleksandra B. Slavković,
Xiaotian Zhu,
Sonja Petrović
Abstract:
A reference set, or a fiber, of a contingency table is the space of all realizations of the table under a given set of constraints such as marginal totals. Understanding the geometry of this space is a key problem in algebraic statistics, important for conducting exact conditional inference, calculating cell bounds, imputing missing cell values, and assessing the risk of disclosure of sensitive information.
Motivated primarily by disclosure limitation problems where constraints can come from summary statistics other than the margins, in this paper we study the space $\mathcal{F_T}$ of all possible multi-way contingency tables for a given sample size and set of observed conditional frequencies. We show that this space can be decomposed according to different possible marginals, which, in turn, are encoded by the solution set of a linear Diophantine equation. We characterize the difference between two fibers: $\mathcal{F_T}$ and the space of tables for a given set of corresponding marginal totals. In particular, we solve a generalization of an open problem posed by Dobra et al. (2008). Our decomposition of $\mathcal{F_T}$ has two important consequences: (1) we derive new cell bounds, some including connections to Directed Acyclic Graphs, and (2) we describe a structure for the Markov bases for the space $\mathcal{F_T}$ that leads to a simplified calculation of Markov bases in this particular setting.
Submitted 7 January, 2014;
originally announced January 2014.
-
Graphical models in Macaulay2
Authors:
Luis David García-Puente,
Sonja Petrović,
Seth Sullivant
Abstract:
The Macaulay2 package GraphicalModels contains algorithms for the algebraic study of graphical models associated to undirected, directed and mixed graphs, and associated collections of conditional independence statements. Among the algorithms implemented are procedures for computing the vanishing ideal of graphical models, for generating conditional independence ideals of families of independence statements associated to graphs, and for checking for identifiable parameters in Gaussian mixed graph models. These procedures can be used to study fundamental problems about graphical models.
Submitted 8 January, 2013; v1 submitted 31 August, 2012;
originally announced August 2012.
-
Ground state of nonlinear Schrödinger systems with saturable nonlinearity
Authors:
Tai-Chia Lin,
Milivoj R. Belić,
Milan S. Petrović,
Goong Chen
Abstract:
We prove the existence of a ground state in a multidimensional nonlinear Schrödinger model of paraxial beam propagation in isotropic local media with saturable nonlinearity. Such ground states exist in the form of bright counterpropagating solitons. From the proof, a general threshold condition on the beam coupling constant for the existence of such fundamental solitons follows.
Submitted 30 August, 2012;
originally announced August 2012.
-
Combinatorial degree bound for toric ideals of hypergraphs
Authors:
Elizabeth Gross,
Sonja Petrović
Abstract:
Associated to any hypergraph is a toric ideal encoding the algebraic relations among its edges. We study these ideals and the combinatorics of their minimal generators, and derive general degree bounds for both uniform and non-uniform hypergraphs in terms of balanced hypergraph bicolorings, separators, and splitting sets. In turn, this provides complexity bounds for algebraic statistical models associated to hypergraphs. As two main applications, we recover a well-known complexity result for Markov bases of arbitrary 3-way tables, and we show that the defining ideal of the tangential variety is generated by quadratics and cubics in cumulant coordinates.
Submitted 21 December, 2012; v1 submitted 12 June, 2012;
originally announced June 2012.
-
Toric algebra of hypergraphs
Authors:
Sonja Petrović,
Despina Stasi
Abstract:
The edges of any hypergraph parametrize a monomial algebra called the edge subring of the hypergraph. We study presentation ideals of these edge subrings, and describe their generators in terms of balanced walks on hypergraphs. Our results generalize those for the defining ideals of edge subrings of graphs, which are well-known in the commutative algebra community, and popular in the algebraic statistics community. One of the motivations for studying toric ideals of hypergraphs comes from algebraic statistics, where generators of the toric ideal give a basis for random walks on fibers of the statistical model specified by the hypergraph. Further, understanding the structure of the generators gives insight into the model geometry.
Submitted 3 April, 2013; v1 submitted 9 June, 2012;
originally announced June 2012.
-
Maximum likelihood degree of variance component models
Authors:
Elizabeth Gross,
Mathias Drton,
Sonja Petrović
Abstract:
Most statistical software packages implement numerical strategies for computation of maximum likelihood estimates in random effects models. Little is known, however, about the algebraic complexity of this problem. For the one-way layout with random effects and unbalanced group sizes, we give formulas for the algebraic degree of the likelihood equations as well as the equations for restricted maximum likelihood estimation. In particular, the latter approach is shown to be algebraically less complex. The formulas are obtained by studying a univariate rational equation whose solutions correspond to the solutions of the likelihood equations. Applying techniques from computational algebra, we also show that balanced two-way layouts with or without interaction have likelihood equations of degree four. Our work suggests that algebraic methods allow one to reliably find global optima of likelihood functions of linear mixed models with a small number of variance components.
Submitted 14 November, 2011;
originally announced November 2011.
-
Maximum likelihood estimation in the $β$-model
Authors:
Alessandro Rinaldo,
Sonja Petrović,
Stephen E. Fienberg
Abstract:
We study maximum likelihood estimation for the statistical model for undirected random graphs, known as the $β$-model, in which the degree sequences are minimal sufficient statistics. We derive necessary and sufficient conditions, based on the polytope of degree sequences, for the existence of the maximum likelihood estimator (MLE) of the model parameters. We characterize in a combinatorial fashion sample points leading to a nonexistent MLE, and nonestimability of the probability parameters under a nonexistent MLE. We formulate conditions that guarantee that the MLE exists with probability tending to one as the number of nodes increases.
Submitted 18 June, 2013; v1 submitted 30 May, 2011;
originally announced May 2011.
-
PHCpack in Macaulay2
Authors:
Elizabeth Gross,
Sonja Petrović,
Jan Verschelde
Abstract:
The Macaulay2 package PHCpack.m2 provides an interface to PHCpack, a general-purpose polynomial system solver that uses homotopy continuation. The main method is a numerical blackbox solver which is implemented for all Laurent systems. The package also provides a fast mixed volume computation, the ability to filter solutions, homotopy path tracking, and a numerical irreducible decomposition method. As the size of many problems in applied algebraic geometry often surpasses the capabilities of symbolic software, this package will be of interest to those working on problems involving large polynomial systems.
Submitted 10 October, 2012; v1 submitted 24 May, 2011;
originally announced May 2011.
-
On the Existence of the MLE for a Directed Random Graph Network Model with Reciprocation
Authors:
A. Rinaldo,
S. Petrović,
S. E. Fienberg
Abstract:
Holland and Leinhardt (1981) proposed a directed random graph model, the p1 model, to describe dyadic interactions in a social network. In previous work (Petrovic et al., 2010), we studied the algebraic properties of the p1 model and showed that it is a toric model specified by a multi-homogeneous ideal. We conducted an extensive study of the Markov bases for p1 that explicitly incorporate the constraint arising from multi-homogeneity. Here we consider the properties of the corresponding toric variety and relate them to the conditions for the existence of the maximum likelihood and extended maximum likelihood estimators of the model parameters. Our results are directly relevant to the estimation and conditional goodness-of-fit testing problems in p1 models.
Submitted 4 October, 2010;
originally announced October 2010.
-
Equality of Graver bases and universal Gröbner bases of colored partition identities
Authors:
Tristram Bogart,
Raymond Hemmecke,
Sonja Petrović
Abstract:
Associated to any vector configuration A is a toric ideal encoded by vectors in the kernel of A. Each toric ideal has two special generating sets: the universal Gröbner basis and the Graver basis. While the former is generally a proper subset of the latter, there are cases for which the two sets coincide. The most prominent examples among them are toric ideals of unimodular matrices. Equality of universal Gröbner basis and Graver basis is a combinatorial property of the toric ideal (or, of the defining matrix), providing interesting information about ideals of higher Lawrence liftings of a matrix. Nonetheless, a general classification of all matrices for which both sets agree is far from known. We contribute to this task by identifying all cases with equality within two families of matrices; namely, those defining rational normal scrolls and those encoding homogeneous primitive colored partition identities.
Submitted 13 April, 2010; v1 submitted 6 April, 2010;
originally announced April 2010.
-
Betti numbers of Stanley-Reisner rings determine hierarchical Markov degrees
Authors:
Sonja Petrović,
Erik Stokes
Abstract:
There are two seemingly unrelated ideals associated with a simplicial complex Δ. One is the Stanley-Reisner ideal I_Δ, the monomial ideal generated by minimal non-faces of Δ, well-known in combinatorial commutative algebra. The other is the toric ideal I_{M(Δ)} of the facet subring of Δ, whose generators give a Markov basis for the hierarchical model defined by Δ, playing a prominent role in algebraic statistics.
In this note we show that the complexity of the generators of I_{M(Δ)} is determined by the Betti numbers of I_Δ. The unexpected connection between the syzygies of the Stanley-Reisner ideal and degrees of minimal generators of the toric ideal provides a framework for further exploration of the connection between the model and its many relatives in algebra and combinatorics.
Submitted 31 May, 2012; v1 submitted 8 October, 2009;
originally announced October 2009.
-
Identifiability of 2-tree mixtures for group-based models
Authors:
Elizabeth S. Allman,
Sonja Petrović,
John A. Rhodes,
Seth Sullivant
Abstract:
Phylogenetic data arising on two possibly different tree topologies might be mixed through several biological mechanisms, including incomplete lineage sorting or horizontal gene transfer in the case of different topologies, or simply different substitution processes on characters in the case of the same topology. Recent work on a 2-state symmetric model of character change showed such a mixture model has non-identifiable parameters, and thus it is theoretically impossible to determine the two tree topologies from any amount of data under such circumstances. Here the question of identifiability is investigated for 2-tree mixtures of the 4-state group-based models, which are more relevant to DNA sequence data. Using algebraic techniques, we show that the tree parameters are identifiable for the JC and K2P models. We also prove that generic substitution parameters for the JC mixture models are identifiable, and for the K2P and K3P models obtain generic identifiability results for mixtures on the same tree. This indicates that the full phylogenetic signal remains in such mixtures, and that the 2-state symmetric result is thus a misleading guide to the behavior of other models.
Submitted 18 December, 2009; v1 submitted 9 September, 2009;
originally announced September 2009.
-
Algebraic statistics for a directed random graph model with reciprocation
Authors:
Sonja Petrović,
Alessandro Rinaldo,
Stephen E. Fienberg
Abstract:
The p_1 model is a directed random graph model used to describe dyadic interactions in a social network in terms of effects due to differential attraction (popularity) and expansiveness, as well as an additional effect due to reciprocation. In this article we carry out an algebraic statistics analysis of this model. We show that the p_1 model is a toric model specified by a multi-homogeneous ideal. We conduct an extensive study of the Markov bases for p_1 models that incorporate explicitly the constraint arising from multi-homogeneity. Our results are directly relevant to the estimation and conditional goodness-of-fit testing problems in p_1 models.
Submitted 16 March, 2010; v1 submitted 31 August, 2009;
originally announced September 2009.
-
Properties of cut ideals associated to ring graphs
Authors:
Uwe Nagel,
Sonja Petrović
Abstract:
A cut ideal of a graph records the relations among the cuts of the graph. These toric ideals were introduced by Sturmfels and Sullivant, who also posed the problem of relating their properties to the combinatorial structure of the graph.
We study the cut ideals of the family of ring graphs, which includes trees and cycles. We show that they have quadratic Gröbner bases and that their coordinate rings are Koszul, Hilbertian, and Cohen-Macaulay, but not Gorenstein in general.
Submitted 27 April, 2009; v1 submitted 3 June, 2008;
originally announced June 2008.
-
On the universal Gröbner bases of varieties of minimal degree
Authors:
Sonja Petrović
Abstract:
A universal Gröbner basis of an ideal is the union of all its reduced Gröbner bases. It is contained in the Graver basis, the set of all primitive elements. Obtaining an explicit description of either of these sets, or even a sharp degree bound for their elements, is a nontrivial task.
In their '95 paper, Graham, Diaconis and Sturmfels give a nice combinatorial description of the Graver basis for any rational normal curve in terms of primitive partition identities. Their result is extended here to rational normal scrolls. The description of the Graver bases is given in terms of colored partition identities. This leads to a sharp bound on the degree of Graver basis elements, which is always attained by a circuit.
Finally, for any variety obtained from a scroll by a sequence of projections to some of the coordinate hyperplanes, the degree of any element in any reduced Gröbner basis is bounded by the degree of the variety.
Submitted 22 November, 2007; v1 submitted 16 November, 2007;
originally announced November 2007.
-
Toric ideals of phylogenetic invariants for the general group-based model on claw trees $K_{1,n}$
Authors:
Julia Chifman,
Sonja Petrović
Abstract:
We address the problem of studying the toric ideals of phylogenetic invariants for a general group-based model on an arbitrary claw tree. We focus on the group $\mathbb Z_2$ and choose a natural recursive approach that extends to other groups. The study of the lattice associated with each phylogenetic ideal produces a list of circuits that generate the corresponding lattice ideal. In addition, we describe explicitly a quadratic lexicographic Gröbner basis of the toric ideal of invariants for the claw tree on an arbitrary number of leaves. Combined with a result of Sturmfels and Sullivant, this implies that the phylogenetic ideal of every tree for the group $\mathbb Z_2$ has a quadratic Gröbner basis. Hence, the coordinate ring of the toric variety is a Koszul algebra.
Submitted 18 April, 2007; v1 submitted 13 February, 2007;
originally announced February 2007.