A Stochastic Block Hypergraph model
Authors:
Alexis Pister,
Marc Barthelemy
Abstract:
The stochastic block model is widely used to generate graphs with a community structure, but no simple alternative currently exists for hypergraphs, in which more than two nodes can be connected together through a hyperedge. We discuss here such a hypergraph generalization, based on the clustering connection probability $P_{ij}$ between nodes of communities $i$ and $j$, which uses an explicit and tunable hyperedge formation process. We focus on the standard case $P_{ij}=p\delta_{ij}+q(1-\delta_{ij})$ with $0\leq q\leq p$. We propose a simple model that satisfies three criteria: it should be as simple as possible; when $p = q$ it should be equivalent to the standard random hypergraph model; and it should use an explicit and tunable hyperedge formation process, so that the model is intuitive and can easily express different real-world formation processes. We first show that for such a model the degree distribution and the hyperedge size distribution can be approximated by binomial distributions with effective parameters that depend on the number of communities and on $q/p$. In addition, the composition of hyperedges ranges from `pure' hyperedges (comprising nodes belonging to the same community) for $q=0$ to `mixed' hyperedges (comprising nodes from different communities) for $q=p$. We test various formation processes, and our results suggest that when they depend on the composition of the hyperedge, they tend to favor the dominant community and lead to hyperedges with a smaller diversity. In contrast, for formation processes that are independent of the hyperedge structure, we obtain hyperedges comprising a larger diversity of communities. The advantages of the model proposed here are its simplicity and flexibility, which make it a good candidate for testing community-related problems such as community detection, the impact of communities on various dynamics, and visualization.
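As a rough illustration of the setup described above, the following Python sketch builds the matrix $P_{ij}=p\delta_{ij}+q(1-\delta_{ij})$ and samples one hyperedge. The abstract does not specify the formation process, so everything beyond the matrix itself is an assumption made for illustration: the function names `block_matrix` and `sample_hyperedge`, the random visiting order of nodes, the choice that a node joins an empty hyperedge with probability $p$, and the `rule` argument (mean, max, min, ...) that aggregates the probabilities between a candidate node and the current members. It should not be read as the authors' implementation.

```python
import numpy as np


def block_matrix(n_communities, p, q):
    """Return P with P[i, j] = p if i == j, else q (requires 0 <= q <= p <= 1)."""
    if not 0 <= q <= p <= 1:
        raise ValueError("expected 0 <= q <= p <= 1")
    same = np.eye(n_communities, dtype=bool)
    return np.where(same, p, q).astype(float)


def sample_hyperedge(P, communities, rng, rule=np.mean):
    """Sample one hyperedge under a hypothetical composition-dependent rule.

    Nodes are visited in random order. A node of community c joins with
    probability rule(P[c, communities of current members]); if the hyperedge
    is still empty, it joins with probability P[c, c] (an assumption).
    """
    communities = np.asarray(communities)
    members = []
    for node in rng.permutation(len(communities)):
        c = communities[node]
        if members:
            prob = rule(P[c, communities[members]])
        else:
            prob = P[c, c]
        if rng.random() < prob:
            members.append(int(node))
    return sorted(members)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    communities = np.repeat(np.arange(3), 10)  # 3 communities of 10 nodes
    P = block_matrix(3, p=0.3, q=0.05)
    edge = sample_hyperedge(P, communities, rng, rule=np.mean)
    print(edge, communities[edge])
```

Under these assumptions, setting $q=p$ makes every node join independently with probability $p$, while $q=0$ can only produce hyperedges whose members all share one community, which mirrors the `pure' and `mixed' limits mentioned in the abstract. Passing a different `rule` is one way to compare formation processes that depend on the hyperedge composition.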
Submitted 14 January, 2025; v1 submitted 19 December, 2023;
originally announced December 2023.