-
The Singular Optimality of Distributed Computation in LOCAL
Authors:
Fabien Dufoulon,
Gopal Pandurangan,
Peter Robinson,
Michele Scquizzato
Abstract:
It has been shown that one can design distributed algorithms that are (nearly) singularly optimal, meaning they simultaneously achieve optimal time and message complexity (within polylogarithmic factors), for several fundamental global problems such as broadcast, leader election, and spanning tree construction, under the $\text{KT}_0$ assumption. With this assumption, nodes have initial knowledge…
▽ More
It has been shown that one can design distributed algorithms that are (nearly) singularly optimal, meaning they simultaneously achieve optimal time and message complexity (within polylogarithmic factors), for several fundamental global problems such as broadcast, leader election, and spanning tree construction, under the $\text{KT}_0$ assumption. With this assumption, nodes have initial knowledge only of themselves, not their neighbors. In this case the time and message lower bounds are $Ω(D)$ and $Ω(m)$, respectively, where $D$ is the diameter of the network and $m$ is the number of edges, and there exist (even) deterministic algorithms that simultaneously match these bounds.
On the other hand, under the $\text{KT}_1$ assumption, whereby each node has initial knowledge of itself and the identifiers of its neighbors, the situation is not clear. For the $\text{KT}_1$ CONGEST model (where messages are of small size), King, Kutten, and Thorup (KKT) showed that one can solve several fundamental global problems (with the notable exception of BFS tree construction) such as broadcast, leader election, and spanning tree construction with $\tilde{O}(n)$ message complexity ($n$ is the network size), which can be significantly smaller than $m$. Randomization is crucial in obtaining this result. While the message complexity of the KKT result is near-optimal, its time complexity is $\tilde{O}(n)$ rounds, which is far from the standard lower bound of $Ω(D)$.
In this paper, we show that in the $\text{KT}_1$ LOCAL model (where message sizes are not restricted), singular optimality is achievable. Our main result is that all global problems, including BFS tree construction, can be solved in $\tilde{O}(D)$ rounds and $\tilde{O}(n)$ messages, where both bounds are optimal up to polylogarithmic factors. Moreover, we show that this can be achieved deterministically.
△ Less
Submitted 11 November, 2024;
originally announced November 2024.
-
Matching on the line admits no $o(\sqrt{\log n})$-competitive algorithm
Authors:
Enoch Peserico,
Michele Scquizzato
Abstract:
We present a simple proof that the competitive ratio of any randomized online matching algorithm for the line is at least $\sqrt{\log_2(n\!+\!1)}/12$ for all $n=2^i\!-\!1: i\in\mathbb{N}$.
We present a simple proof that the competitive ratio of any randomized online matching algorithm for the line is at least $\sqrt{\log_2(n\!+\!1)}/12$ for all $n=2^i\!-\!1: i\in\mathbb{N}$.
△ Less
Submitted 31 December, 2020;
originally announced December 2020.
-
Equivalence Classes and Conditional Hardness in Massively Parallel Computations
Authors:
Danupon Nanongkai,
Michele Scquizzato
Abstract:
The Massively Parallel Computation (MPC) model serves as a common abstraction of many modern large-scale data processing frameworks, and has been receiving increasingly more attention over the past few years, especially in the context of classical graph problems. So far, the only way to argue lower bounds for this model is to condition on conjectures about the hardness of some specific problems, s…
▽ More
The Massively Parallel Computation (MPC) model serves as a common abstraction of many modern large-scale data processing frameworks, and has been receiving increasingly more attention over the past few years, especially in the context of classical graph problems. So far, the only way to argue lower bounds for this model is to condition on conjectures about the hardness of some specific problems, such as graph connectivity on promise graphs that are either one cycle or two cycles, usually called the one cycle vs. two cycles problem. This is unlike the traditional arguments based on conjectures about complexity classes (e.g., $\textsf{P} \neq \textsf{NP}$), which are often more robust in the sense that refuting them would lead to groundbreaking algorithms for a whole bunch of problems.
In this paper we present connections between problems and classes of problems that allow the latter type of arguments. These connections concern the class of problems solvable in a sublogarithmic amount of rounds in the MPC model, denoted by $\textsf{MPC}(o(\log N))$, and some standard classes concerning space complexity, namely $\textsf{L}$ and $\textsf{NL}$, and suggest conjectures that are robust in the sense that refuting them would lead to many surprisingly fast new algorithms in the MPC model. We also obtain new conditional lower bounds, and prove new reductions and equivalences between problems in the MPC model.
△ Less
Submitted 7 January, 2020;
originally announced January 2020.
-
A Lower Bound Technique for Communication in BSP
Authors:
Gianfranco Bilardi,
Michele Scquizzato,
Francesco Silvestri
Abstract:
Communication is a major factor determining the performance of algorithms on current computing systems; it is therefore valuable to provide tight lower bounds on the communication complexity of computations. This paper presents a lower bound technique for the communication complexity in the bulk-synchronous parallel (BSP) model of a given class of DAG computations. The derived bound is expressed i…
▽ More
Communication is a major factor determining the performance of algorithms on current computing systems; it is therefore valuable to provide tight lower bounds on the communication complexity of computations. This paper presents a lower bound technique for the communication complexity in the bulk-synchronous parallel (BSP) model of a given class of DAG computations. The derived bound is expressed in terms of the switching potential of a DAG, that is, the number of permutations that the DAG can realize when viewed as a switching network. The proposed technique yields tight lower bounds for the fast Fourier transform (FFT), and for any sorting and permutation network. A stronger bound is also derived for the periodic balanced sorting network, by applying this technique to suitable subnetworks. Finally, we demonstrate that the switching potential captures communication requirements even in computational models different from BSP, such as the I/O model and the LPRAM.
△ Less
Submitted 25 November, 2017; v1 submitted 7 July, 2017;
originally announced July 2017.
-
A Time- and Message-Optimal Distributed Algorithm for Minimum Spanning Trees
Authors:
Gopal Pandurangan,
Peter Robinson,
Michele Scquizzato
Abstract:
This paper presents a randomized Las Vegas distributed algorithm that constructs a minimum spanning tree (MST) in weighted networks with optimal (up to polylogarithmic factors) time and message complexity. This algorithm runs in $\tilde{O}(D + \sqrt{n})$ time and exchanges $\tilde{O}(m)$ messages (both with high probability), where $n$ is the number of nodes of the network, $D$ is the diameter, an…
▽ More
This paper presents a randomized Las Vegas distributed algorithm that constructs a minimum spanning tree (MST) in weighted networks with optimal (up to polylogarithmic factors) time and message complexity. This algorithm runs in $\tilde{O}(D + \sqrt{n})$ time and exchanges $\tilde{O}(m)$ messages (both with high probability), where $n$ is the number of nodes of the network, $D$ is the diameter, and $m$ is the number of edges. This is the first distributed MST algorithm that matches \emph{simultaneously} the time lower bound of $\tildeΩ(D + \sqrt{n})$ [Elkin, SIAM J. Comput. 2006] and the message lower bound of $Ω(m)$ [Kutten et al., J.ACM 2015] (which both apply to randomized algorithms).
The prior time and message lower bounds are derived using two completely different graph constructions; the existing lower bound construction that shows one lower bound {\em does not} work for the other. To complement our algorithm, we present a new lower bound graph construction for which any distributed MST algorithm requires \emph{both} $\tildeΩ(D + \sqrt{n})$ rounds and $Ω(m)$ messages.
△ Less
Submitted 23 January, 2018; v1 submitted 22 July, 2016;
originally announced July 2016.
-
On the Distributed Complexity of Large-Scale Graph Computations
Authors:
Gopal Pandurangan,
Peter Robinson,
Michele Scquizzato
Abstract:
Motivated by the increasing need to understand the distributed algorithmic foundations of large-scale graph computations, we study some fundamental graph problems in a message-passing model for distributed computing where $k \geq 2$ machines jointly perform computations on graphs with $n$ nodes (typically, $n \gg k$). The input graph is assumed to be initially randomly partitioned among the $k$ ma…
▽ More
Motivated by the increasing need to understand the distributed algorithmic foundations of large-scale graph computations, we study some fundamental graph problems in a message-passing model for distributed computing where $k \geq 2$ machines jointly perform computations on graphs with $n$ nodes (typically, $n \gg k$). The input graph is assumed to be initially randomly partitioned among the $k$ machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication {\em rounds} of the computation.
Our main contribution is the {\em General Lower Bound Theorem}, a theorem that can be used to show non-trivial lower bounds on the round complexity of distributed large-scale data computations. The General Lower Bound Theorem is established via an information-theoretic approach that relates the round complexity to the minimal amount of information required by machines to solve the problem. Our approach is generic and this theorem can be used in a "cookbook" fashion to show distributed lower bounds in the context of several problems, including non-graph problems. We present two applications by showing (almost) tight lower bounds for the round complexity of two fundamental graph problems, namely {\em PageRank computation} and {\em triangle enumeration}. Our approach, as demonstrated in the case of PageRank, can yield tight lower bounds for problems (including, and especially, under a stochastic partition of the input) where communication complexity techniques are not obvious.
Our approach, as demonstrated in the case of triangle enumeration, can yield stronger round lower bounds as well as message-round tradeoffs compared to approaches that use communication complexity techniques.
△ Less
Submitted 26 July, 2018; v1 submitted 26 February, 2016;
originally announced February 2016.
-
Fast Distributed Algorithms for Connectivity and MST in Large Graphs
Authors:
Gopal Pandurangan,
Peter Robinson,
Michele Scquizzato
Abstract:
Motivated by the increasing need to understand the algorithmic foundations of distributed large-scale graph computations, we study a number of fundamental graph problems in a message-passing model for distributed computing where $k \geq 2$ machines jointly perform computations on graphs with $n$ nodes (typically, $n \gg k$). The input graph is assumed to be initially randomly partitioned among the…
▽ More
Motivated by the increasing need to understand the algorithmic foundations of distributed large-scale graph computations, we study a number of fundamental graph problems in a message-passing model for distributed computing where $k \geq 2$ machines jointly perform computations on graphs with $n$ nodes (typically, $n \gg k$). The input graph is assumed to be initially randomly partitioned among the $k$ machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication rounds of the computation.
Our main result is an (almost) optimal distributed randomized algorithm for graph connectivity. Our algorithm runs in $\tilde{O}(n/k^2)$ rounds ($\tilde{O}$ notation hides a $\poly\log(n)$ factor and an additive $\poly\log(n)$ term). This improves over the best previously known bound of $\tilde{O}(n/k)$ [Klauck et al., SODA 2015], and is optimal (up to a polylogarithmic factor) in view of an existing lower bound of $\tildeΩ(n/k^2)$. Our improved algorithm uses a bunch of techniques, including linear graph sketching, that prove useful in the design of efficient distributed graph algorithms. Using the connectivity algorithm as a building block, we then present fast randomized algorithms for computing minimum spanning trees, (approximate) min-cuts, and for many graph verification problems. All these algorithms take $\tilde{O}(n/k^2)$ rounds, and are optimal up to polylogarithmic factors. We also show an almost matching lower bound of $\tildeΩ(n/k^2)$ rounds for many graph verification problems by leveraging lower bounds in random-partition communication complexity.
△ Less
Submitted 6 July, 2016; v1 submitted 8 March, 2015;
originally announced March 2015.
-
Network-Oblivious Algorithms
Authors:
Gianfranco Bilardi,
Andrea Pietracaprina,
Geppino Pucci,
Michele Scquizzato,
Francesco Silvestri
Abstract:
A framework is proposed for the design and analysis of \emph{network-oblivious algorithms}, namely, algorithms that can run unchanged, yet efficiently, on a variety of machines characterized by different degrees of parallelism and communication capabilities. The framework prescribes that a network-oblivious algorithm be specified on a parallel model of computation where the only parameter is the p…
▽ More
A framework is proposed for the design and analysis of \emph{network-oblivious algorithms}, namely, algorithms that can run unchanged, yet efficiently, on a variety of machines characterized by different degrees of parallelism and communication capabilities. The framework prescribes that a network-oblivious algorithm be specified on a parallel model of computation where the only parameter is the problem's input size, and then evaluated on a model with two parameters, capturing parallelism granularity and communication latency. It is shown that, for a wide class of network-oblivious algorithms, optimality in the latter model implies optimality in the Decomposable BSP model, which is known to effectively describe a wide and significant class of parallel platforms. The proposed framework can be regarded as an attempt to port the notion of obliviousness, well established in the context of cache hierarchies, to the realm of parallel computation. Its effectiveness is illustrated by providing optimal network-oblivious algorithms for a number of key problems. Some limitations of the oblivious approach are also discussed.
△ Less
Submitted 12 April, 2014;
originally announced April 2014.
-
Distributed Algorithms for Large-Scale Graphs
Authors:
Khalid Hourani,
Hartmut Klauck,
William K. Moses Jr.,
Danupon Nanongkai,
Gopal Pandurangan,
Peter Robinson,
Michele Scquizzato
Abstract:
Motivated by the increasing need for fast processing of large-scale graphs, we study a number of fundamental graph problems in a message-passing model for distributed computing, called $k$-machine model, where we have $k$ machines that jointly perform computations on $n$-node graphs. The graph is assumed to be partitioned in a balanced fashion among the $k$ machines, a common implementation in man…
▽ More
Motivated by the increasing need for fast processing of large-scale graphs, we study a number of fundamental graph problems in a message-passing model for distributed computing, called $k$-machine model, where we have $k$ machines that jointly perform computations on $n$-node graphs. The graph is assumed to be partitioned in a balanced fashion among the $k$ machines, a common implementation in many real-world systems. Communication is point-to-point via bandwidth-constrained links, and the goal is to minimize the round complexity, i.e., the number of communication rounds required to finish a computation.
We present a generic methodology that allows to obtain efficient algorithms in the $k$-machine model using distributed algorithms for the classical CONGEST model of distributed computing. Using this methodology, we obtain algorithms for various fundamental graph problems such as connectivity, minimum spanning trees, shortest paths, maximal independent sets, and finding subgraphs, showing that many of these problems can be solved in $\tilde{O}(n/k)$ rounds; this shows that one can achieve speedup nearly linear in $k$.
To complement our upper bounds, we present lower bounds on the round complexity that quantify the fundamental limitations of solving graph problems distributively. We first show a lower bound of $Ω(n/k)$ rounds for computing a spanning tree of the input graph. This result implies the same bound for other fundamental problems such as computing a minimum spanning tree, breadth-first tree, or shortest paths tree. We also show a $\tilde Ω(n/k^2)$ lower bound for connectivity, spanning tree verification and other related problems. The latter lower bounds follow from the development and application of novel results in a random-partition variant of the classical communication complexity model.
△ Less
Submitted 8 February, 2023; v1 submitted 25 November, 2013;
originally announced November 2013.
-
Communication Lower Bounds for Distributed-Memory Computations
Authors:
Michele Scquizzato,
Francesco Silvestri
Abstract:
We give lower bounds on the communication complexity required to solve several computational problems in a distributed-memory parallel machine, namely standard matrix multiplication, stencil computations, comparison sorting, and the Fast Fourier Transform. We revisit the assumptions under which preceding results were derived and provide new lower bounds which use much weaker and appropriate hypoth…
▽ More
We give lower bounds on the communication complexity required to solve several computational problems in a distributed-memory parallel machine, namely standard matrix multiplication, stencil computations, comparison sorting, and the Fast Fourier Transform. We revisit the assumptions under which preceding results were derived and provide new lower bounds which use much weaker and appropriate hypotheses. Our bounds rely on a mild assumption on work distribution, and strengthen previous results which require either the computation to be balanced among the processors, or specific initial distributions of the input data, or an upper bound on the size of processors' local memories.
△ Less
Submitted 20 September, 2013; v1 submitted 6 July, 2013;
originally announced July 2013.