-
New bounds on the modularity of $G(n,p)$
Authors:
Katarzyna Rybarczyk,
Małgorzata Sulkowska
Abstract:
Modularity is a parameter indicating the presence of community structure in a graph. Nowadays it lies at the core of widely used clustering algorithms. We study the modularity of the most classical random graph, the binomial $G(n,p)$. In 2020 McDiarmid and Skerman proved, using spectral graph theory and a specific subgraph construction by Coja-Oghlan from 2007, that there exists a constant $b$ such that with high probability the modularity of $G(n,p)$ is at most $b/\sqrt{np}$. The obtained constant $b$ is very large and not easily computable. We improve upon this result, showing that a constant under $3$ may be derived here. Interestingly, it can be obtained by basic probabilistic tools. We also address the lower bound on the modularity of $G(n,p)$ and improve the results of McDiarmid and Skerman from 2020 using estimates of bisections of random graphs derived by Dembo, Montanari, and Sen in 2017.
Submitted 22 April, 2025;
originally announced April 2025.
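The abstract above does not restate the definition of modularity, so the following is only a minimal sketch under the standard Newman-Girvan definition: the score of a partition is the sum over parts $A$ of $e(A)/m - (\mathrm{vol}(A)/(2m))^2$, and the modularity of the graph is the maximum score over all partitions. The function names, the parameters `n = 400`, `p = 0.05`, and the choice of a fixed bisection are illustrative assumptions, not taken from the paper. Setting the score of an edgeless graph to 0 is a convention chosen for this sketch.

```python
import random
from itertools import combinations


def sample_gnp(n, p, rng):
    """Return the edge list of one binomial random graph G(n, p)."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def modularity_score(edges, parts):
    """Newman-Girvan modularity score of a vertex partition.

    Sums, over the parts A, e(A)/m - (vol(A)/(2m))^2, where m is the
    number of edges, e(A) counts edges inside A, and vol(A) is the sum
    of degrees in A.
    """
    m = len(edges)
    if m == 0:
        return 0.0  # convention chosen for this sketch
    part_of = {v: i for i, part in enumerate(parts) for v in part}
    inside = [0] * len(parts)
    volume = [0] * len(parts)
    for u, v in edges:
        volume[part_of[u]] += 1
        volume[part_of[v]] += 1
        if part_of[u] == part_of[v]:
            inside[part_of[u]] += 1
    return sum(e / m - (vol / (2 * m)) ** 2 for e, vol in zip(inside, volume))


if __name__ == "__main__":
    rng = random.Random(0)
    n, p = 400, 0.05  # illustrative values only
    edges = sample_gnp(n, p, rng)
    halves = [range(0, n // 2), range(n // 2, n)]
    print("score of a fixed bisection:", modularity_score(edges, halves))
    print("1/sqrt(np) for comparison:", (n * p) ** -0.5)
```

The score of any single partition only bounds the modularity from below, so this sketch cannot check the upper bound $b/\sqrt{np}$; it merely shows the quantity being bounded.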
-
Modularity of preferential attachment graphs
Authors:
Katarzyna Rybarczyk,
Małgorzata Sulkowska
Abstract:
We study the preferential attachment model $G_n^h$. A graph $G_n^h$ is generated from a finite initial graph by adding new vertices one at a time. Each new vertex connects to $h\ge 1$ already existing vertices, and these are chosen with probability proportional to their current degrees. We are particularly interested in the community structure of $G_n^h$, which is expressed in terms of the so-called modularity. We prove that the modularity of $G_n^h$ is with high probability upper bounded by a function that tends to $0$ as $h$ tends to infinity. This resolves the conjecture of Prokhorenkova, Pralat, and Raigorodskii from 2016.
As a byproduct, we obtain novel concentration results (which are interesting in their own right) for the volume and edge density parameters of vertex subsets of $G_n^h$. The key ingredient here is the definition of the function $\mu$, which serves as a natural measure for vertex subsets, and is proportional to the average size of their volumes. This extends previous results on the topic by Frieze, Pralat, Pérez-Giménez, and Reiniger from 2019.
Submitted 7 April, 2025; v1 submitted 12 January, 2025;
originally announced January 2025.
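The growth rule described above (each new vertex attaches to $h$ existing vertices chosen with probability proportional to their current degrees) can be sketched as follows. The abstract leaves several details open, so this sketch makes its own assumptions: the initial graph is a single edge, the $h$ targets are drawn independently so that multi-edges may occur, and degrees are frozen while one vertex attaches. The function name `preferential_attachment` is hypothetical.

```python
import random
from collections import Counter


def preferential_attachment(n, h, rng):
    """Grow a multigraph on vertices 0..n-1 and return its edge list.

    Assumptions of this sketch: the initial graph is a single edge
    (0, 1), the h targets of a new vertex are drawn independently
    (so multi-edges may occur), and degrees are frozen while one
    vertex attaches.
    """
    edges = [(0, 1)]
    endpoints = [0, 1]  # vertex v appears deg(v) times in this list
    for new in range(2, n):
        # A uniform draw from `endpoints` picks v with probability deg(v) / (2m).
        targets = [rng.choice(endpoints) for _ in range(h)]
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges


if __name__ == "__main__":
    edges = preferential_attachment(n=2000, h=3, rng=random.Random(1))
    degree = Counter(v for e in edges for v in e)
    print("edges:", len(edges))
    print("five largest degrees:", [d for _, d in degree.most_common(5)])
```

Keeping one list entry per edge endpoint turns degree-proportional sampling into a uniform draw, which avoids recomputing weights at every step.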
-
Preferential attachment hypergraph with vertex deactivation
Authors:
Frédéric Giroire,
Nicolas Nisse,
Kostiantyn Ohulchanskyi,
Małgorzata Sulkowska,
Thibaud Trolliet
Abstract:
In the field of complex networks, hypergraph models have so far received significantly less attention than graphs. However, many real-life networks feature multiary relations (co-authorship, protein reactions) and may therefore be modeled much better by hypergraphs. Also, a recent study by Broido and Clauset suggests that a power-law degree distribution is not as ubiquitous in natural systems as was thought so far. They experimentally confirm that a majority of networks (56% of around 1000 networks that underwent the test) favor a power-law with an exponential cutoff over other distributions. We address the two above observations by introducing a preferential attachment hypergraph model which allows for vertex deactivations. The phenomenon of vertex deactivations is rare in existing theoretical models and omnipresent in real-life scenarios (social network accounts which are not maintained forever, collaboration networks in which people retire, technological networks in which devices break down). We prove that the degree distribution of the proposed model follows a power-law with an exponential cutoff. We also check experimentally that a Scopus collaboration network has the same characteristic. We believe that our model will predict well the behavior of systems from a variety of domains.
Submitted 6 June, 2023; v1 submitted 29 April, 2022;
originally announced May 2022.
-
Running minimum in the best-choice problem
Authors:
Alexander Gnedin,
Patryk Kozieł,
Małgorzata Sulkowska
Abstract:
We consider the best-choice problem for independent (not necessarily iid) observations $X_1, \cdots, X_n$ with the aim of selecting the sample minimum. We show that in this full generality the monotone case of optimal stopping holds and the stopping domain may be defined by a sequence of monotone thresholds. In the iid case we get universal lower bounds for the success probability. We cast the general problem with independent observations as a variational first-passage problem for the running minimum process, which simplifies obtaining the formula for the success probability. We illustrate this approach by revisiting the full-information game (where the $X_j$'s are iid uniform-$[0,1]$), in particular deriving new representations for the success probability and its limit as $n \rightarrow \infty$. Two explicitly solvable models with discrete $X_j$'s are presented: in the first the distribution is uniform on $\{j,\cdots,n\}$, and in the second the distribution is uniform on $\{1,\cdots, n\}$. These examples are chosen to contrast two situations where the ties vanish or persist in the large-$n$ Poisson limit.
Submitted 12 October, 2021; v1 submitted 11 October, 2021;
originally announced October 2021.
-
Preferential attachment hypergraph with high modularity
Authors:
Frédéric Giroire,
Nicolas Nisse,
Thibaud Trolliet,
Małgorzata Sulkowska
Abstract:
Numerous models have been proposed to generate random graphs preserving the properties of real-life large scale networks. However, many real networks are better represented by hypergraphs. Few models for generating random hypergraphs exist, and no general model preserves both a power-law degree distribution and a high modularity indicating the presence of communities. We present a dynamic preferential attachment hypergraph model which features a partition into communities. We prove that its degree distribution follows a power-law and we give theoretical lower bounds for its modularity. We compare its characteristics with a real-life co-authorship network and show that our model performs well. We believe that our hypergraph model will be an interesting tool that may be used in many research domains in order to better reflect real-life phenomena.
Submitted 1 March, 2021;
originally announced March 2021.
-
Modularity of minor-free graphs
Authors:
Michał Lasoń,
Małgorzata Sulkowska
Abstract:
We prove that a class of graphs with an excluded minor and with the maximum degree sublinear in the number of edges is maximally modular, that is, modularity tends to 1 as the number of edges tends to infinity.
Submitted 14 February, 2021;
originally announced February 2021.
-
Counting embeddings of rooted trees into families of rooted trees
Authors:
Bernhard Gittenberger,
Zbigniew Gołębiewski,
Isabella Larcher,
Małgorzata Sulkowska
Abstract:
The number of embeddings of a partially ordered set $S$ in a partially ordered set $T$ is the number of subposets of $T$ isomorphic to $S$. If both $S$ and $T$ have a unique maximal element, we define good embeddings as those in which the maximal elements of $S$ and $T$ overlap. We investigate the number of good and all embeddings of a rooted poset $S$ in the family of all binary trees on $n$ elements considering two cases: plane (when the order of descendants matters) and non-plane. Furthermore, we study the number of embeddings of a rooted poset $S$ in the family of all planted plane trees of size $n$. We derive the asymptotic behaviour of good and all embeddings in all cases and we prove that the ratio of good embeddings to all is of the order $\Theta(1/\sqrt{n})$ in all cases, where we provide the exact constants. Furthermore, we show that this ratio is non-decreasing with $S$ in the plane binary case and asymptotically non-decreasing with $S$ in the non-plane binary case and in the planted plane case. Finally, we comment on the case when $S$ is disconnected.
Submitted 29 September, 2021; v1 submitted 19 August, 2020;
originally announced August 2020.
-
What Do Our Choices Say About Our Preferences?
Authors:
Krzysztof Grining,
Marek Klonowski,
Małgorzata Sulkowska
Abstract:
Taking online decisions is a part of everyday life. Think of buying a house, parking a car or taking part in an auction. We often take those decisions publicly, which may breach our privacy - a party observing our choices may learn a lot about our preferences. In this paper we investigate the online stopping algorithms from the privacy preserving perspective, using a mathematically rigorous differential privacy notion. In differentially private algorithms there is usually an issue of balancing the privacy and utility. In this regime, in most cases, having both optimality and high level of privacy at the same time is impossible. We propose a natural mechanism to achieve a controllable trade-off, quantified by a parameter, between the accuracy of the online algorithm and its privacy. Depending on the parameter, our mechanism can be optimal with weaker differential privacy or suboptimal, yet more privacy-preserving. We conduct a detailed accuracy and privacy analysis of our mechanism applied to the optimal algorithm for the classical secretary problem. Thereby the classical notions from two distinct areas - optimal stopping and differential privacy - meet for the first time.
Submitted 26 July, 2023; v1 submitted 4 May, 2020;
originally announced May 2020.
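For orientation, the classical secretary rule that the abstract above takes as its starting point can be sketched as below. This is the well-known threshold rule only, not the authors' privacy mechanism, which the abstract does not specify. The cutoff `round(n / e)` is the asymptotic choice and only approximates the optimal cutoff for finite $n$; the function names are hypothetical.

```python
import math
import random


def classical_rule(values, cutoff):
    """Index at which the classical rule stops.

    Observe the first `cutoff` values without stopping, then stop at the
    first value that beats everything seen so far; if none does, the
    rule is forced to take the last value.
    """
    best_seen = max(values[:cutoff], default=float("-inf"))
    for i in range(cutoff, len(values)):
        if values[i] > best_seen:
            return i
    return len(values) - 1


def success_rate(n, trials, rng):
    """Monte Carlo estimate of P(the rule stops on the overall maximum)."""
    cutoff = round(n / math.e)  # asymptotic cutoff, an approximation for finite n
    wins = 0
    for _ in range(trials):
        values = list(range(n))
        rng.shuffle(values)
        if values[classical_rule(values, cutoff)] == n - 1:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    print("estimated success rate:", success_rate(50, 20000, random.Random(0)))
    print("1/e for comparison:", 1 / math.e)
```

Because the stopping index of this rule is a deterministic function of the observed ranks, an observer learns something about them, which is the privacy concern the abstract raises.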
-
Maximizing the expected number of components in an online search of a graph
Authors:
Fabrício Siqueira Benevides,
Małgorzata Sulkowska
Abstract:
The following optimal stopping problem is considered. The vertices of a graph $G$ are revealed one by one, in a random order, to a selector. He aims to stop this process at a time $t$ that maximizes the expected number of connected components in the graph $\tilde{G}_t$, induced by the currently revealed vertices. The selector knows $G$ in advance, but different versions of the game are considered depending on the information that he gets about $\tilde{G}_t$. We show that when $G$ has $N$ vertices and maximum degree of order $o(\sqrt{N})$, then the number of components of $\tilde{G}_t$ is concentrated around its mean, which implies that playing the optimal strategy the selector does not benefit much by receiving more information about $\tilde{G}_t$. Results of similar nature were previously obtained by M. Lasoń for the case where $G$ is a $k$-tree (for constant $k$). We also consider the particular cases where $G$ is a square, triangular or hexagonal lattice, showing that an optimal selector gains $cN$ components and we compute $c$ with an error less than $0.005$ in each case.
Submitted 1 October, 2021; v1 submitted 2 April, 2020;
originally announced April 2020.
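The process in the abstract above, in which vertices are revealed in random order and the selector watches the number of connected components of the induced graph $\tilde{G}_t$, can be simulated with a union-find structure. The sketch below records one trajectory of the component count on a square lattice; it does not implement an optimal stopping strategy or estimate the constant $c$. The lattice side `30` and all function names are illustrative assumptions.

```python
import random


def component_counts(n, edges, order):
    """Number of components of the induced graph after each reveal.

    `order` lists the vertices 0..n-1 in the order they are revealed;
    entry t-1 of the result is the component count of the graph induced
    by the first t revealed vertices.
    """
    neighbours = [[] for _ in range(n)]
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    revealed = [False] * n
    components = 0
    counts = []
    for v in order:
        revealed[v] = True
        components += 1  # v starts as its own component
        for w in neighbours[v]:
            if revealed[w]:
                rv, rw = find(v), find(w)
                if rv != rw:
                    parent[rv] = rw  # merge two components
                    components -= 1
        counts.append(components)
    return counts


def square_lattice(side):
    """Edges of a side x side grid, vertices numbered row by row."""
    edges = []
    for r in range(side):
        for c in range(side):
            v = r * side + c
            if c + 1 < side:
                edges.append((v, v + 1))
            if r + 1 < side:
                edges.append((v, v + side))
    return edges


if __name__ == "__main__":
    side = 30  # illustrative size
    order = list(range(side * side))
    random.Random(2).shuffle(order)
    counts = component_counts(side * side, square_lattice(side), order)
    best = max(range(len(counts)), key=counts.__getitem__)
    print("peak of", counts[best], "components after", best + 1, "reveals")
```

The count first rises as isolated vertices appear and later falls as components merge, which is why the choice of stopping time matters.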
-
An Optimal Algorithm for Stopping on the Element Closest to the Center of an Interval
Authors:
Ewa M. Kubicka,
Grzegorz Kubicki,
Małgorzata Kuchta,
Małgorzata Sulkowska
Abstract:
Real numbers from the interval $[0, 1]$ are randomly selected with uniform distribution. There are $n$ of them and they are revealed one by one. However, we do not know their values but only their relative ranks. We want to stop on the most recently revealed number, maximizing the probability that this number is closest to $\frac{1}{2}$. We design an optimal stopping algorithm achieving our goal and prove that its probability of success is asymptotically equivalent to $\frac{1}{\sqrt{n}}\sqrt{\frac{2}{\pi}}$.
Submitted 29 April, 2019;
originally announced April 2019.
-
Protection numbers in simply generated trees and Pólya trees
Authors:
Bernhard Gittenberger,
Zbigniew Gołębiewski,
Isabella Larcher,
Małgorzata Sulkowska
Abstract:
We determine the limit of the expected value and the variance of the protection number of the root in simply generated trees, in Pólya trees, and in unlabelled non-plane binary trees, when the number of vertices tends to infinity. Moreover, we compute the expectation and variance of the protection number of a randomly chosen vertex in all those tree classes. We obtain exact formulas as sum representations, where the obtained sums converge rapidly and therefore allow an efficient numerical computation of high accuracy. Most proofs are based on a singularity analysis of generating functions.
Submitted 6 April, 2019;
originally announced April 2019.
-
Uniform random posets
Authors:
Patryk Kozieł,
Małgorzata Sulkowska
Abstract:
We propose a simple algorithm generating labelled posets of given size according to the almost uniform distribution. By "almost uniform" we understand that the distribution of generated posets converges in total variation to the uniform distribution. Our method is based on a Markov chain generating directed acyclic graphs.
Submitted 12 October, 2018;
originally announced October 2018.
-
How to Cooperate Locally to Improve Global Privacy in Social Networks? On Amplification of Privacy Preserving Data Aggregation
Authors:
Krzysztof Grining,
Marek Klonowski,
Małgorzata Sulkowska
Abstract:
In many systems privacy of users depends on the number of participants applying collectively some method to protect their security. Indeed, there are numerous already classic results about revealing aggregated data from a set of users. The conclusion is usually as follows: if you have enough friends to "aggregate" the private data, you can safely reveal your private information.
Apart from data aggregation, it has been noticed that in a wider context privacy can often be reduced to being hidden in a crowd. Generally, the problem is how to create such a crowd. This task may not be easy in some distributed systems, wherein gathering enough "individuals" is hard for practical reasons.
Examples are social networks (or similar systems), where users have only a limited number of semi-trusted contacts and their aim is to reveal some aggregated data in a privacy-preserving manner. This may be particularly problematic in the presence of a strong adversary that can additionally corrupt some users.
We show two methods that significantly amplify privacy with only a limited number of local operations and very moderate communication overhead. Besides theoretical analysis, we show experimental results on topologies of real-life social networks to demonstrate that our methods can significantly amplify the privacy of chosen aggregation protocols even when facing a massive attack by a powerful adversary.
We believe however that our results can have much wider applications for improving security of systems based on locally trusted relations.
Submitted 26 April, 2017; v1 submitted 18 April, 2017;
originally announced April 2017.
-
From directed path to linear order - the best choice problem for powers of directed path
Authors:
Andrzej Grzesik,
Michał Morayne,
Małgorzata Sulkowska
Abstract:
We examine the evolution of the best choice algorithm and the probability of its success from a directed path to the linear order of the same cardinality through $k$th powers of a directed path, $1 \leq k < n$. The vertices of a $k$th power of a directed path of a known length $n$ are exposed one by one to a selector in some random order. At any time the selector can see the graph induced by the vertices that have already come. The selector's aim is to choose online the maximal vertex (i.e. the vertex with no outgoing edges). It is shown that the probability of success $p_n$ for the optimal algorithm for the $k$th power of a directed path satisfies $p_n = \Theta(n^{-1/(k+1)})$. We also consider the case when the selector knows the distance in the underlying path between any two vertices that are joined by an edge in the induced graph. An optimal algorithm for this choice problem is presented. The exact probability of success when using this algorithm is given.
Submitted 12 August, 2013;
originally announced August 2013.