-
New bounds on the modularity of $G(n,p)$
Authors:
Katarzyna Rybarczyk,
Małgorzata Sulkowska
Abstract:
Modularity is a parameter indicating the presence of community structure in a graph. It lies at the core of widely used clustering algorithms. We study the modularity of the most classical random graph, the binomial $G(n,p)$. In 2020 McDiarmid and Skerman proved, using spectral graph theory and a specific subgraph construction by Coja-Oghlan from 2007, that there exists a constant $b$ such that with high probability the modularity of $G(n,p)$ is at most $b/\sqrt{np}$. The constant $b$ obtained there is very large and not easily computable. We improve upon this result by showing that a constant below $3$ can be derived. Interestingly, it can be obtained with basic probabilistic tools. We also address the lower bound on the modularity of $G(n,p)$ and improve the results of McDiarmid and Skerman from 2020 using estimates of bisections of random graphs derived by Dembo, Montanari, and Sen in 2017.
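To make the quantity in this abstract concrete, here is a minimal sketch (not taken from the paper) that samples a binomial random graph and evaluates the modularity score of one fixed partition, using the standard definition $\sum_A \left[ e(A)/m - (\mathrm{vol}(A)/2m)^2 \right]$. The function names `gnp` and `modularity` are illustrative assumptions, and the code scores a single given partition only; the modularity of the graph is the maximum of this score over all partitions, which the sketch does not attempt to find.

```python
import random
from itertools import combinations


def gnp(n, p, seed=None):
    """Sample the edge list of a binomial random graph G(n, p) on vertices 0..n-1."""
    rng = random.Random(seed)
    # Each of the n*(n-1)/2 possible edges is included independently with probability p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def modularity(edges, parts):
    """Modularity score of the partition `parts` (a list of vertex collections).

    Every endpoint of every edge must belong to exactly one part.
    A graph with no edges is assigned score 0.0 (a common convention, assumed here).
    """
    m = len(edges)
    if m == 0:
        return 0.0
    part_of = {v: i for i, part in enumerate(parts) for v in part}
    inside = [0] * len(parts)  # e(A): number of edges with both endpoints in part A
    volume = [0] * len(parts)  # vol(A): sum of degrees of the vertices in part A
    for u, v in edges:
        volume[part_of[u]] += 1
        volume[part_of[v]] += 1
        if part_of[u] == part_of[v]:
            inside[part_of[u]] += 1
    return sum(e / m - (vol / (2 * m)) ** 2 for e, vol in zip(inside, volume))
```

For example, `modularity(gnp(200, 0.1, seed=1), [range(100), range(100, 200)])` scores an arbitrary bisection of one sample; the upper bound in the abstract concerns the best score over every possible partition.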
Submitted 22 April, 2025;
originally announced April 2025.
-
Modularity of preferential attachment graphs
Authors:
Katarzyna Rybarczyk,
Małgorzata Sulkowska
Abstract:
We study the preferential attachment model $G_n^h$. A graph $G_n^h$ is generated from a finite initial graph by adding new vertices one at a time. Each new vertex connects to $h\ge 1$ already existing vertices, and these are chosen with probability proportional to their current degrees. We are particularly interested in the community structure of $G_n^h$, which is expressed in terms of the so-called modularity. We prove that the modularity of $G_n^h$ is with high probability upper bounded by a function that tends to $0$ as $h$ tends to infinity. This resolves the conjecture of Prokhorenkova, Pralat, and Raigorodskii from 2016.
As a byproduct, we obtain novel concentration results (which are interesting in their own right) for the volume and edge density parameters of vertex subsets of $G_n^h$. The key ingredient here is the definition of the function $\mu$, which serves as a natural measure for vertex subsets, and is proportional to the average size of their volumes. This extends previous results on the topic by Frieze, Pralat, Pérez-Giménez, and Reiniger from 2019.
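The growth rule described in this abstract can be sketched as follows. This is an illustrative simulation under stated assumptions, not the paper's exact model: the initial graph is taken to be a single edge between vertices 0 and 1, the $h$ targets of a new vertex are drawn with replacement (so multi-edges may occur), and degrees are those at the moment the new vertex arrives. The name `preferential_attachment` is hypothetical.

```python
import random


def preferential_attachment(n, h, seed=None):
    """Grow a graph on vertices 0..n-1 by degree-proportional attachment.

    Assumptions (not from the abstract): the initial graph is the single
    edge (0, 1); targets are sampled with replacement, so multi-edges are
    allowed; degrees are not updated while one vertex picks its h targets.
    """
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Vertex v appears in `endpoints` exactly deg(v) times, so a uniform
    # draw from this list is a draw proportional to current degree.
    endpoints = [0, 1]
    for new in range(2, n):
        targets = [rng.choice(endpoints) for _ in range(h)]
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges
```

Keeping a flat list of edge endpoints makes each degree-proportional draw a single uniform choice, at the cost of memory linear in the number of edges.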
Submitted 7 April, 2025; v1 submitted 12 January, 2025;
originally announced January 2025.
-
Running minimum in the best-choice problem
Authors:
Alexander Gnedin,
Patryk Kozieł,
Małgorzata Sulkowska
Abstract:
We consider the best-choice problem for independent (not necessarily iid) observations $X_1, \ldots, X_n$ with the aim of selecting the sample minimum. We show that in this full generality the monotone case of optimal stopping holds and the stopping domain may be defined by a sequence of monotone thresholds. In the iid case we obtain universal lower bounds for the success probability. We cast the general problem with independent observations as a variational first-passage problem for the running minimum process, which simplifies obtaining the formula for the success probability. We illustrate this approach by revisiting the full-information game (where the $X_j$'s are iid uniform-$[0,1]$), in particular deriving new representations for the success probability and its limit as $n \rightarrow \infty$. Two explicitly solvable models with discrete $X_j$'s are presented: in the first the distribution is uniform on $\{j,\ldots,n\}$, and in the second the distribution is uniform on $\{1,\ldots, n\}$. These examples are chosen to contrast two situations where the ties vanish or persist in the large-$n$ Poisson limit.
Submitted 12 October, 2021; v1 submitted 11 October, 2021;
originally announced October 2021.
-
Preferential attachment hypergraph with high modularity
Authors:
Frédéric Giroire,
Nicolas Nisse,
Thibaud Trolliet,
Małgorzata Sulkowska
Abstract:
Numerous models have been proposed to generate random graphs that preserve the properties of real-life large-scale networks. However, many real networks are better represented by hypergraphs. Few models for generating random hypergraphs exist, and no general model preserves both a power-law degree distribution and a high modularity indicating the presence of communities. We present a dynamic preferential attachment hypergraph model which features a partition into communities. We prove that its degree distribution follows a power law and we give theoretical lower bounds for its modularity. We compare its characteristics with a real-life co-authorship network and show that our model performs well. We believe that our hypergraph model will be an interesting tool that may be used in many research domains in order to better reflect real-life phenomena.
Submitted 1 March, 2021;
originally announced March 2021.
-
Modularity of minor-free graphs
Authors:
Michał Lasoń,
Małgorzata Sulkowska
Abstract:
We prove that a class of graphs with an excluded minor and with the maximum degree sublinear in the number of edges is maximally modular, that is, modularity tends to 1 as the number of edges tends to infinity.
Submitted 14 February, 2021;
originally announced February 2021.
-
Counting embeddings of rooted trees into families of rooted trees
Authors:
Bernhard Gittenberger,
Zbigniew Gołębiewski,
Isabella Larcher,
Małgorzata Sulkowska
Abstract:
The number of embeddings of a partially ordered set $S$ in a partially ordered set $T$ is the number of subposets of $T$ isomorphic to $S$. If both $S$ and $T$ have a unique maximal element, we define good embeddings as those in which the maximal elements of $S$ and $T$ overlap. We investigate the number of good and all embeddings of a rooted poset $S$ in the family of all binary trees on $n$ elements considering two cases: plane (when the order of descendants matters) and non-plane. Furthermore, we study the number of embeddings of a rooted poset $S$ in the family of all planted plane trees of size $n$. We derive the asymptotic behaviour of good and all embeddings in all cases and we prove that the ratio of good embeddings to all is of the order $\Theta(1/\sqrt{n})$ in all cases, where we provide the exact constants. Furthermore, we show that this ratio is non-decreasing with $S$ in the plane binary case and asymptotically non-decreasing with $S$ in the non-plane binary case and in the planted plane case. Finally, we comment on the case when $S$ is disconnected.
Submitted 29 September, 2021; v1 submitted 19 August, 2020;
originally announced August 2020.
-
Maximizing the expected number of components in an online search of a graph
Authors:
Fabrício Siqueira Benevides,
Małgorzata Sulkowska
Abstract:
The following optimal stopping problem is considered. The vertices of a graph $G$ are revealed one by one, in a random order, to a selector. He aims to stop this process at a time $t$ that maximizes the expected number of connected components in the graph $\tilde{G}_t$, induced by the currently revealed vertices. The selector knows $G$ in advance, but different versions of the game are considered depending on the information that he gets about $\tilde{G}_t$. We show that when $G$ has $N$ vertices and maximum degree of order $o(\sqrt{N})$, then the number of components of $\tilde{G}_t$ is concentrated around its mean, which implies that playing the optimal strategy the selector does not benefit much by receiving more information about $\tilde{G}_t$. Results of similar nature were previously obtained by M. Lasoń for the case where $G$ is a $k$-tree (for constant $k$). We also consider the particular cases where $G$ is a square, triangular or hexagonal lattice, showing that an optimal selector gains $cN$ components and we compute $c$ with an error less than $0.005$ in each case.
Submitted 1 October, 2021; v1 submitted 2 April, 2020;
originally announced April 2020.
-
An Optimal Algorithm for Stopping on the Element Closest to the Center of an Interval
Authors:
Ewa M. Kubicka,
Grzegorz Kubicki,
Małgorzata Kuchta,
Małgorzata Sulkowska
Abstract:
Real numbers from the interval $[0, 1]$ are randomly selected with uniform distribution. There are $n$ of them and they are revealed one by one. However, we do not know their values but only their relative ranks. We want to stop on the most recently revealed number so as to maximize the probability that this number is the closest to $\frac{1}{2}$. We design an optimal stopping algorithm achieving our goal and prove that its probability of success is asymptotically equivalent to $\frac{1}{\sqrt{n}}\sqrt{\frac{2}{\pi}}$.
Submitted 29 April, 2019;
originally announced April 2019.
-
Protection numbers in simply generated trees and Pólya trees
Authors:
Bernhard Gittenberger,
Zbigniew Gołębiewski,
Isabella Larcher,
Małgorzata Sulkowska
Abstract:
We determine the limit of the expected value and the variance of the protection number of the root in simply generated trees, in Pólya trees, and in unlabelled non-plane binary trees, when the number of vertices tends to infinity. Moreover, we compute the expectation and variance of the protection number of a randomly chosen vertex in all those tree classes. We obtain exact formulas as sum representations, where the obtained sums converge rapidly and therefore allow efficient numerical computation of high accuracy. Most proofs are based on a singularity analysis of generating functions.
Submitted 6 April, 2019;
originally announced April 2019.
-
Uniform random posets
Authors:
Patryk Kozieł,
Małgorzata Sulkowska
Abstract:
We propose a simple algorithm generating labelled posets of given size according to the almost uniform distribution. By "almost uniform" we understand that the distribution of generated posets converges in total variation to the uniform distribution. Our method is based on a Markov chain generating directed acyclic graphs.
Submitted 12 October, 2018;
originally announced October 2018.
-
From directed path to linear order - the best choice problem for powers of directed path
Authors:
Andrzej Grzesik,
Michał Morayne,
Małgorzata Sulkowska
Abstract:
We examine the evolution of the best choice algorithm and the probability of its success from a directed path to the linear order of the same cardinality through $k$th powers of a directed path, $1 \leq k < n$. The vertices of a $k$th power of a directed path of a known length $n$ are exposed one by one to a selector in some random order. At any time the selector can see the graph induced by the vertices that have already arrived. The selector's aim is to choose online the maximal vertex (i.e. the vertex with no outgoing edges). It is shown that the probability of success $p_n$ for the optimal algorithm for the $k$th power of a directed path satisfies $p_n = \Theta(n^{-1/(k+1)})$. We also consider the case when the selector knows the distance in the underlying path between every two vertices that are joined by an edge in the induced graph. An optimal algorithm for this choice problem is presented. The exact probability of success when using this algorithm is given.
Submitted 12 August, 2013;
originally announced August 2013.