Monte Carlo goodness-of-fit tests for degree corrected and related stochastic blockmodels
Authors: Vishesh Karwa, Debdeep Pati, Sonja Petrović, Liam Solus, Nikita Alexeev, Mateja Raič, Dane Wilburne, Robert Williams, Bowei Yan
Abstract:
We construct Bayesian and frequentist finite-sample goodness-of-fit tests for three different variants of the stochastic blockmodel for network data. Since all of the stochastic blockmodel variants are log-linear in form when block assignments are known, the tests for the \emph{latent} block model versions combine a block membership estimator with the algebraic statistics machinery for testing goodness-of-fit in log-linear models. We describe Markov bases and marginal polytopes of the variants of the stochastic blockmodel, and discuss how both facilitate the development of goodness-of-fit tests and understanding of model behavior.
The general testing methodology developed here extends to any finite mixture of log-linear models on discrete data, and as such is the first application of the algebraic statistics machinery for latent-variable models.
Submitted 6 March, 2024; v1 submitted 18 December, 2016; originally announced December 2016.
Statistical models for cores decomposition of an undirected random graph
Authors: Vishesh Karwa, Michael J. Pelsmajer, Sonja Petrović, Despina Stasi, Dane Wilburne
Abstract:
The $k$-core decomposition is a widely studied summary statistic that describes a graph's global connectivity structure. In this paper, we move beyond using $k$-core decomposition as a tool to summarize a graph and propose using $k$-core decomposition as a tool to model random graphs. We propose using the shell distribution vector, a way of summarizing the decomposition, as a sufficient statistic for a family of exponential random graph models. We study the properties and behavior of the model family, implement a Markov chain Monte Carlo algorithm for simulating graphs from the model, implement a direct sampler from the set of graphs with a given shell distribution, and explore the sampling distributions of some of the commonly used complementary statistics as good candidates for heuristic model fitting. These algorithms provide the first fundamental steps necessary for solving the following problems: parameter estimation in this ERGM, extending the model to its Bayesian relative, and developing a rigorous methodology for testing goodness of fit of the model and model selection. The methods are applied to a synthetic network as well as the well-known Sampson monks dataset.
Submitted 28 November, 2016; v1 submitted 27 October, 2014; originally announced October 2014.