-
Directed Semi-Simplicial Learning with Applications to Brain Activity Decoding
Authors:
Manuel Lecha,
Andrea Cavallo,
Francesca Dominici,
Ran Levi,
Alessio Del Bue,
Elvin Isufi,
Pietro Morerio,
Claudio Battiloro
Abstract:
Graph Neural Networks (GNNs) excel at learning from pairwise interactions but often overlook multi-way and hierarchical relationships. Topological Deep Learning (TDL) addresses this limitation by leveraging combinatorial topological spaces. However, existing TDL models are restricted to undirected settings and fail to capture the higher-order directed patterns prevalent in many complex systems, e.g., brain networks, where such interactions are both abundant and functionally significant. To fill this gap, we introduce Semi-Simplicial Neural Networks (SSNs), a principled class of TDL models that operate on semi-simplicial sets -- combinatorial structures that encode directed higher-order motifs and their directional relationships. To enhance scalability, we propose Routing-SSNs, which dynamically select the most informative relations in a learnable manner. We prove that SSNs are strictly more expressive than standard graph and TDL models. We then introduce a new principled framework for brain dynamics representation learning, grounded in the ability of SSNs to provably recover topological descriptors shown to successfully characterize brain activity. Empirically, SSNs achieve state-of-the-art performance on brain dynamics classification tasks, outperforming the second-best model by up to 27%, and message passing GNNs by up to 50% in accuracy. Our results highlight the potential of principled topological models for learning from structured brain data, establishing a unique real-world case study for TDL. We also test SSNs on standard node classification and edge regression tasks, showing competitive performance. We will make the code and data publicly available.
Submitted 27 May, 2025; v1 submitted 23 May, 2025;
originally announced May 2025.
-
Approximately Counting and Sampling Hamiltonian Motifs in Sublinear Time
Authors:
Talya Eden,
Reut Levi,
Dana Ron,
Ronitt Rubinfeld
Abstract:
Counting small subgraphs, referred to as motifs, in large graphs is a fundamental task in graph analysis, extensively studied across various contexts and computational models. In the sublinear-time regime, the relaxed problem of approximate counting has been explored within two prominent query frameworks: the standard model, which permits degree, neighbor, and pair queries, and the strictly more powerful augmented model, which additionally allows for uniform edge sampling. Currently, in the standard model, (optimal) results have been established only for approximately counting edges, stars, and cliques, all of which have a radius of one. This contrasts sharply with the state of affairs in the augmented model, where algorithmic results (some of which are optimal) are known for any input motif, leading to a disparity which we term the "scope gap" between the two models.
In this work, we make significant progress in bridging this gap. Our approach draws inspiration from recent advancements in the augmented model and utilizes a framework centered on counting by uniform sampling, thus allowing us to establish new results in the standard model and simplify previous results.
In particular, our first, and main, contribution is a new algorithm in the standard model for approximately counting any Hamiltonian motif in sublinear time. Our second contribution is a variant of our algorithm that enables nearly uniform sampling of these motifs, a capability previously limited in the standard model to edges and cliques. Our third contribution is to introduce even simpler algorithms for stars and cliques by exploiting their radius-one property. As a result, we simplify all previously known algorithms in the standard model for stars (Gonen, Ron, Shavitt (SODA 2010)), triangles (Eden, Levi, Ron, Seshadhri (FOCS 2015)) and cliques (Eden, Ron, Seshadhri (STOC 2018)).
Submitted 12 March, 2025;
originally announced March 2025.
-
Heterogeneous Treatment Effects in Panel Data
Authors:
Retsef Levi,
Elisabeth Paulson,
Georgia Perakis,
Emily Zhang
Abstract:
We address a core problem in causal inference: estimating heterogeneous treatment effects using panel data with general treatment patterns. Many existing methods either do not utilize the potential underlying structure in panel data or have limitations in the allowable treatment patterns. In this work, we propose and evaluate a new method that first partitions observations into disjoint clusters with similar treatment effects using a regression tree, and then leverages the (assumed) low-rank structure of the panel data to estimate the average treatment effect for each cluster. Our theoretical results establish the convergence of the resulting estimates to the true treatment effects. Computational experiments with semi-synthetic data show that our method achieves superior accuracy compared to alternative approaches, using a regression tree with no more than 40 leaves. Hence, our method provides more accurate and interpretable estimates than alternative methods.
Submitted 9 June, 2024;
originally announced June 2024.
-
Testing $C_k$-freeness in bounded-arboricity graphs
Authors:
Talya Eden,
Reut Levi,
Dana Ron
Abstract:
We study the problem of testing $C_k$-freeness ($k$-cycle-freeness) for fixed constant $k > 3$ in graphs with bounded arboricity (but unbounded degrees). In particular, we are interested in one-sided error algorithms, so that they must detect a copy of $C_k$ with high constant probability when the graph is $ε$-far from $C_k$-free. We next state our results for constant arboricity and constant $ε$ with a focus on the dependence on the number of graph vertices, $n$. The query complexity of all our algorithms grows polynomially with $1/ε$. (1) As opposed to the case of $k=3$, where the complexity of testing $C_3$-freeness grows with the arboricity of the graph but not with the size of the graph (Levi, ICALP 2021) this is no longer the case already for $k=4$. We show that $Ω(n^{1/4})$ queries are necessary for testing $C_4$-freeness, and that $\widetilde{O}(n^{1/4})$ are sufficient. The same bounds hold for $C_5$. (2) For every fixed $k \geq 6$, any one-sided error algorithm for testing $C_k$-freeness must perform $Ω(n^{1/3})$ queries. (3) For $k=6$ we give a testing algorithm whose query complexity is $\widetilde{O}(n^{1/2})$. (4) For any fixed $k$, the query complexity of testing $C_k$-freeness is upper bounded by ${O}(n^{1-1/\lfloor k/2\rfloor})$.
Our $Ω(n^{1/4})$ lower bound for testing $C_4$-freeness in constant arboricity graphs provides a negative answer to an open problem posed by Goldreich (2021).
Submitted 28 April, 2024;
originally announced April 2024.
-
Distributed CONGEST Algorithm for Finding Hamiltonian Paths in Dirac Graphs and Generalizations
Authors:
Noy Biton,
Reut Levi,
Moti Medina
Abstract:
We study the problem of finding a Hamiltonian cycle under the promise that the input graph has a minimum degree of at least $n/2$, where $n$ denotes the number of vertices in the graph. The classical theorem of Dirac states that such graphs (a.k.a. Dirac graphs) are Hamiltonian, i.e., contain a Hamiltonian cycle. Moreover, finding a Hamiltonian cycle in Dirac graphs can be done in polynomial time in the classical centralized model.
This paper presents a randomized distributed CONGEST algorithm that finds w.h.p. a Hamiltonian cycle (as well as maximum matching) within $O(\log n)$ rounds under the promise that the input graph is a Dirac graph. This upper bound is in contrast to general graphs in which both the decision and search variants of Hamiltonicity require $\tilde{Ω}(n^2)$ rounds, as shown by Bachrach et al. [PODC'19].
In addition, we consider two generalizations of Dirac graphs: Ore graphs and Rahman-Kaykobad graphs [IPL'05]. In Ore graphs, the sum of the degrees of every pair of non-adjacent vertices is at least $n$, and in Rahman-Kaykobad graphs, the sum of the degrees of every pair of non-adjacent vertices plus their distance is at least $n+1$. We show how our algorithm for Dirac graphs can be adapted to work for these more general families of graphs.
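The degree conditions that define Dirac and Ore graphs can be checked directly. The following is a minimal centralized sketch for illustration only, not the distributed algorithm of the paper; the adjacency-dictionary representation, the function names `is_dirac` and `is_ore`, and the example graphs are assumptions introduced here.

```python
from itertools import combinations


def is_dirac(adj):
    """True if every vertex has degree at least n/2 (the Dirac condition)."""
    n = len(adj)
    return all(2 * len(neighbors) >= n for neighbors in adj.values())


def is_ore(adj):
    """True if deg(u) + deg(v) >= n for every pair of non-adjacent vertices."""
    n = len(adj)
    return all(
        len(adj[u]) + len(adj[v]) >= n
        for u, v in combinations(adj, 2)
        if v not in adj[u]
    )


# Hypothetical example graphs, given as vertex -> set of neighbors.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
# Five vertices: vertex 0 has degree 2 < 5/2, yet every non-adjacent pair
# has degree sum 5, so the graph is Ore without being Dirac.
ore_not_dirac = {
    0: {1, 2},
    1: {0, 2, 3, 4},
    2: {0, 1, 3, 4},
    3: {1, 2, 4},
    4: {1, 2, 3},
}
```

Every Dirac graph satisfies the Ore condition, since any two vertices of degree at least $n/2$ have degree sum at least $n$; the third example shows that the converse fails.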
Submitted 20 July, 2023; v1 submitted 1 February, 2023;
originally announced February 2023.
-
Supply Chain Characteristics as Predictors of Cyber Risk: A Machine-Learning Assessment
Authors:
Kevin Hu,
Retsef Levi,
Raphael Yahalom,
El Ghali Zerhouni
Abstract:
This paper provides the first large-scale data-driven analysis to evaluate the predictive power of different attributes for assessing risk of cyberattack data breaches. Furthermore, motivated by the rapid increase in third party enabled cyberattacks, the paper provides the first quantitative empirical evidence that digital supply-chain attributes are significant predictors of enterprise cyber risk. The paper leverages outside-in cyber risk scores that aim to capture the quality of the enterprise internal cybersecurity management, and augments these with supply chain features that are inspired by observed third party cyberattack scenarios, as well as concepts from network science research. The main quantitative result of the paper is to show that supply chain network features add significant detection power to predicting enterprise cyber risk, relative to merely using enterprise-only attributes. In particular, compared to a base model that relies only on internal enterprise features, the supply chain network features improve the out-of-sample AUC by 2.3%. Given that each cyber data breach is a low probability high impact risk event, these improvements in the prediction power have significant value. Additionally, the model highlights several cybersecurity risk drivers related to third party cyberattack and breach mechanisms and provides important insights as to what interventions might be effective to mitigate these risks.
Submitted 13 November, 2023; v1 submitted 27 October, 2022;
originally announced October 2022.
-
Improved Local Computation Algorithms for Constructing Spanners
Authors:
Rubi Arviv,
Lily Chung,
Reut Levi,
Edward Pyne
Abstract:
A spanner of a graph is a subgraph that preserves lengths of shortest paths up to a multiplicative distortion. For every $k$, a spanner with size $O(n^{1+1/k})$ and stretch $(2k+1)$ can be constructed by a simple centralized greedy algorithm, and this is tight assuming the Erdős girth conjecture.
In this paper we study the problem of constructing spanners in a local manner, specifically in the Local Computation Model proposed by Rubinfeld et al. (ICS 2011).
We provide a randomized Local Computation Algorithm (LCA) for constructing $(2r-1)$-spanners with $\tilde{O}(n^{1+1/r})$ edges and probe complexity of $\tilde{O}(n^{1-1/r})$ for $r \in \{2,3\}$, where $n$ denotes the number of vertices in the input graph. Up to polylogarithmic factors, in both cases, the stretch factor is optimal (for the respective number of edges). In addition, our probe complexity for $r=2$, i.e., for constructing a $3$-spanner, is optimal up to polylogarithmic factors. Our result improves over the probe complexity of Parter et al. (ITCS 2019) that is $\tilde{O}(n^{1-1/2r})$ for $r \in \{2,3\}$. Both our algorithms and the algorithms of Parter et al. use a combination of neighbor-probes and pair-probes in the above-mentioned LCAs.
For general $k\geq 1$, we provide an LCA for constructing $O(k^2)$-spanners with $\tilde{O}(n^{1+1/k})$ edges using $O(n^{2/3}Δ^2)$ neighbor-probes, improving over the $\tilde{O}(n^{2/3}Δ^4)$ algorithm of Parter et al.
By developing a new randomized LCA for graph decomposition, we further improve the probe complexity of the latter task to be $O(n^{2/3-(1.5-α)/k}Δ^2)$, for any constant $α>0$. This latter LCA may be of independent interest.
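For context, the "simple centralized greedy algorithm" mentioned at the start of the abstract can be sketched as follows for unweighted graphs. This is the textbook centralized greedy construction, not the paper's LCA; the function names, the edge-list input format, and the generic `stretch` parameter are assumptions made for this illustration.

```python
from collections import deque


def bounded_distance(adj, src, dst, limit):
    """Return the BFS distance from src to dst, or None if it exceeds limit."""
    if src == dst:
        return 0
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if dist[u] == limit:
            continue  # do not explore beyond the allowed stretch
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                if w == dst:
                    return dist[w]
                queue.append(w)
    return None


def greedy_spanner(n, edges, stretch):
    """Keep an edge only if its endpoints are farther apart than `stretch`
    in the subgraph built so far; vertices are assumed to be 0..n-1."""
    adj = {v: set() for v in range(n)}
    kept = []
    for u, v in edges:
        if bounded_distance(adj, u, v, stretch) is None:
            adj[u].add(v)
            adj[v].add(u)
            kept.append((u, v))
    return kept


# Hypothetical input: the complete graph on four vertices.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
k4_spanner = greedy_spanner(4, k4_edges, stretch=3)
```

Each discarded edge already has a path of length at most `stretch` between its endpoints at the time it is examined, which is why the output preserves all distances up to that factor. Note that this procedure inspects the whole graph, whereas an LCA must answer whether a single edge belongs to the spanner using few probes.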
Submitted 6 July, 2023; v1 submitted 11 May, 2021;
originally announced May 2021.
-
Testing Triangle Freeness in the General Model in Graphs with Arboricity $O(\sqrt{n})$
Authors:
Reut Levi
Abstract:
We study the problem of testing triangle freeness in the general graph model. This problem was first studied in the general graph model by Alon et al. (SIAM J. Discret. Math. 2008) who provided both lower bounds and upper bounds that depend on the number of vertices and the average degree of the graph. Their bounds are tight only when $d_{\rm max} = O(d)$ and $\bar{d} \leq \sqrt{n}$ or when $\bar{d} = Θ(1)$, where $d_{\rm max}$ denotes the maximum degree and $\bar{d}$ denotes the average degree of the graph. In this paper we provide bounds that depend on the arboricity of the graph and the average degree. As in Alon et al., the parameters of our tester are the number of vertices, $n$, the number of edges, $m$, and the proximity parameter $ε$ (the arboricity of the graph is not a parameter of the algorithm). The query complexity of our tester is $\tilde{O}(Γ/\bar{d} + Γ)\cdot poly(1/ε)$ in expectation, where $Γ$ denotes the arboricity of the input graph (we use $\tilde{O}(\cdot)$ to suppress $O(\log \log n)$ factors). We show that for graphs with arboricity $O(\sqrt{n})$ this upper bound is tight in the following sense. For any $Γ\in [s]$ where $s= Θ(\sqrt{n})$ there exists a family of graphs with arboricity $Γ$ and average degree $\bar{d}$ such that $Ω(Γ/\bar{d} + Γ)$ queries are required for testing triangle freeness on this family of graphs. Moreover, this lower bound holds for any such $Γ$ and for a large range of feasible average degrees.
Submitted 11 May, 2021;
originally announced May 2021.
-
Testing Hamiltonicity (and other problems) in Minor-Free Graphs
Authors:
Reut Levi,
Nadav Shoshan
Abstract:
In this paper we provide sub-linear algorithms for several fundamental problems in the setting in which the input graph excludes a fixed minor, i.e., is a minor-free graph. In particular, we provide the following algorithms for minor-free unbounded degree graphs. (1) A tester for Hamiltonicity with two-sided error with $poly(1/ε)$-query complexity, where $ε$ is the proximity parameter. (2) A local algorithm, as defined by Rubinfeld et al. (ICS 2011), for constructing a spanning subgraph with almost minimum weight, specifically, at most a factor $(1+ε)$ of the optimum, with $poly(1/ε)$-query complexity. Both our algorithms use partition oracles, a tool introduced by Hassidim et al. (FOCS 2009), which are oracles that provide access to a partition of the graph such that the number of cut-edges is small and each part of the partition is small. The polynomial dependence in $1/ε$ of our algorithms is achieved by combining the recent $poly(d/ε)$-query partition oracle of Kumar-Seshadhri-Stolman (ECCC 2021) for minor-free graphs with degree bounded by $d$.
For bounded degree minor-free graphs we introduce the notion of covering partition oracles which is a relaxed version of partition oracles and design a $poly(d/ε)$-time covering partition oracle. Using our covering partition oracle we provide the same results as above (except that the tester for Hamiltonicity has one-sided error) for minor-free bounded degree graphs, as well as showing that any property which is monotone and additive (e.g. bipartiteness) can be tested in minor-free graphs by making $poly(d/ε)$-queries.
The benefit of using the covering partition oracle rather than the partition oracle in our algorithms is its simplicity and an improved polynomial dependence in $1/ε$ in the obtained query complexity.
Submitted 11 May, 2021; v1 submitted 23 February, 2021;
originally announced February 2021.
-
The Limits to Learning a Diffusion Model
Authors:
Jackie Baek,
Vivek F. Farias,
Andreea Georgescu,
Retsef Levi,
Tianyi Peng,
Deeksha Sinha,
Joshua Wilde,
Andrew Zheng
Abstract:
This paper provides the first sample complexity lower bounds for the estimation of simple diffusion models, including the Bass model (used in modeling consumer adoption) and the SIR model (used in modeling epidemics). We show that one cannot hope to learn such models until quite late in the diffusion. Specifically, we show that the time required to collect a number of observations that exceeds our sample complexity lower bounds is large. For Bass models with low innovation rates, our results imply that one cannot hope to predict the eventual number of adopting customers until one is at least two-thirds of the way to the time at which the rate of new adopters is at its peak. In a similar vein, our results imply that in the case of an SIR model, one cannot hope to predict the eventual number of infections until one is approximately two-thirds of the way to the time at which the infection rate has peaked. This lower bound in estimation further translates into a lower bound in regret for decision-making in epidemic interventions. Our results formalize the challenge of accurate forecasting and highlight the importance of incorporating additional data sources. To this end, we analyze the benefit of a seroprevalence study in an epidemic, where we characterize the size of the study needed to improve SIR model estimation. Extensive empirical analyses on product adoption and epidemic data support our theoretical findings.
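To make "the time at which the rate of new adopters is at its peak" concrete, here is a minimal sketch based on the standard textbook formulation of the Bass model, $dF/dt = (p + qF)(1 - F)$, which is assumed here rather than taken from the paper. The symbols `p` (innovation rate) and `q` (imitation rate), the function names, and the parameter values are illustrative assumptions.

```python
import math


def bass_cdf(t, p, q):
    """Fraction of eventual adopters by time t (standard closed form)."""
    decay = math.exp(-(p + q) * t)
    return (1.0 - decay) / (1.0 + (q / p) * decay)


def bass_rate(t, p, q):
    """Adoption rate dF/dt = (p + q F)(1 - F) at time t."""
    fraction = bass_cdf(t, p, q)
    return (p + q * fraction) * (1.0 - fraction)


def bass_peak_time(p, q):
    """Time of peak adoption rate; the rate only decreases when q <= p."""
    if q <= p:
        return 0.0
    return math.log(q / p) / (p + q)


# Illustrative parameters only: a low innovation rate relative to imitation.
p, q = 0.01, 0.4
t_peak = bass_peak_time(p, q)
two_thirds_mark = (2.0 / 3.0) * t_peak
```

Under these assumptions, `two_thirds_mark` is the point on the time axis that the abstract's statement refers to for Bass models with low innovation rates: before roughly that time, the eventual number of adopters cannot be predicted reliably.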
Submitted 23 May, 2023; v1 submitted 11 June, 2020;
originally announced June 2020.
-
Distributed Testing of Graph Isomorphism in the CONGEST model
Authors:
Reut Levi,
Moti Medina
Abstract:
In this paper we study the problem of testing graph isomorphism (GI) in the CONGEST distributed model. In this setting we test whether the distributive network, $G_U$, is isomorphic to $G_K$ which is given as an input to all the nodes in the network, or alternatively, only to a single node.
We first consider the decision variant of the problem in which the algorithm distinguishes $G_U$ and $G_K$ which are isomorphic from $G_U$ and $G_K$ which are not isomorphic. We provide a randomized algorithm with $O(n)$ rounds for the setting in which $G_K$ is given only to a single node. We prove that for this setting any deterministic algorithm requires $\tilde{Ω}(n^2)$ rounds, where $n$ denotes the number of nodes, which implies a separation between the randomized and the deterministic complexities of deciding GI.
We then consider the \emph{property testing} variant of the problem, where the algorithm is only required to distinguish the case that $G_U$ and $G_K$ are isomorphic from the case that $G_U$ and $G_K$ are \emph{far} from being isomorphic (according to some predetermined distance measure). We show that every algorithm requires $Ω(D)$ rounds, where $D$ denotes the diameter of the network. This lower bound holds even if all the nodes are given $G_K$ as an input, and even if the message size is unbounded. We provide a randomized algorithm with an almost matching round complexity of $O(D+(ε^{-1}\log n)^2)$ rounds that is suitable for dense graphs.
We also show that with the same number of rounds it is possible that each node outputs its mapping according to a bijection which is an \emph{approximated} isomorphism.
We conclude with simple simulation arguments that allow us to obtain essentially tight algorithms with round complexity $\tilde{O}(D)$ for special families of sparse graphs.
Submitted 1 March, 2020;
originally announced March 2020.
-
Property Testing of Planarity in the CONGEST model
Authors:
Reut Levi,
Moti Medina,
Dana Ron
Abstract:
We give a distributed algorithm in the CONGEST model for property testing of planarity with one-sided error in general (unbounded-degree) graphs. Following Censor-Hillel et al. (DISC 2016), who recently initiated the study of property testing in the distributed setting, our algorithm gives the following guarantee: For a graph $G = (V,E)$ and a distance parameter $ε$, if $G$ is planar, then every node outputs accept, and if $G$ is $ε$-far from being planar (i.e., more than $ε\cdot |E|$ edges need to be removed in order to make $G$ planar), then with probability $1-1/{\rm poly}(n)$ at least one node outputs reject. The algorithm runs in $O(\log|V|\cdot{\rm poly}(1/ε))$ rounds, and we show that this result is tight in terms of the dependence on $|V|$.
Our algorithm combines several techniques of graph partitioning and local verification of planar embeddings. Furthermore, we show how a main subroutine in our algorithm can be applied to derive additional results for property testing of cycle-freeness and bipartiteness, as well as the construction of spanners, in minor-free (unweighted) graphs.
Submitted 14 August, 2019; v1 submitted 27 May, 2018;
originally announced May 2018.
-
Graph Ranking and the Cost of Sybil Defense
Authors:
Gwendolyn Farach-Colton,
Martin Farach-Colton,
Leslie Ann Goldberg,
Hanna Komlos,
John Lapinskas,
Reut Levi,
Moti Medina,
Miguel A. Mosteiro
Abstract:
Ranking functions such as PageRank assign numeric values (ranks) to nodes of graphs, most notably the web graph. Node rankings are an integral part of Internet search algorithms, since they can be used to order the results of queries. However, these ranking functions are famously subject to attacks by spammers, who modify the web graph in order to give their own pages more rank. We characterize the interplay between rankers and spammers as a game. We define the two critical features of this game, spam resistance and distortion, based on how spammers spam and how rankers protect against spam. We observe that all the ranking functions that are well-studied in the literature, including the original formulation of PageRank, have poor spam resistance, poor distortion, or both. Finally, we study Min-PPR, the form of PageRank used at Google itself, but which has received no (theoretical or empirical) treatment in the literature. We prove that Min-PPR has low distortion and high spam resistance. A secondary benefit is that Min-PPR comes with an explicit cost function on nodes that shows how important they are to the spammer; thus a ranker can focus their spam-detection capacity on these vulnerable nodes. Both Min-PPR and its associated cost function are straightforward to compute.
Submitted 1 June, 2023; v1 submitted 13 March, 2018;
originally announced March 2018.
-
A Sublinear Tester for Outerplanarity (and Other Forbidden Minors) With One-Sided Error
Authors:
Hendrik Fichtenberger,
Reut Levi,
Yadu Vasudev,
Maximilian Wötzel
Abstract:
We consider one-sided error property testing of $\mathcal{F}$-minor freeness in bounded-degree graphs for any finite family of graphs $\mathcal{F}$ that contains a minor of $K_{2,k}$, the $k$-circus graph, or the $(k\times 2)$-grid for any $k\in\mathbb{N}$. This includes, for instance, testing whether a graph is outerplanar or a cactus graph. The query complexity of our algorithm in terms of the number of vertices in the graph, $n$, is $\tilde{O}(n^{2/3} / ε^5)$. Czumaj et al. showed that cycle-freeness and $C_k$-minor freeness can be tested with query complexity $\tilde{O}(\sqrt{n})$ by using random walks, and that testing $H$-minor freeness for any $H$ that contains a cycle requires $Ω(\sqrt{n})$ queries. In contrast to these results, we analyze the structure of the graph and show that either we can find a subgraph of sublinear size that includes the forbidden minor $H$, or we can find a pair of disjoint subsets of vertices whose edge-cut is large, which induces an $H$-minor.
Submitted 8 August, 2018; v1 submitted 19 July, 2017;
originally announced July 2017.
-
Testing bounded arboricity
Authors:
Talya Eden,
Reut Levi,
Dana Ron
Abstract:
In this paper we consider the problem of testing whether a graph has bounded arboricity. The family of graphs with bounded arboricity includes, among others, bounded-degree graphs, all minor-closed graph classes (e.g. planar graphs, graphs with bounded treewidth) and randomly generated preferential attachment graphs. Graphs with bounded arboricity have been studied extensively in the past, in particular since for many problems they allow for much more efficient algorithms and/or better approximation ratios.
We present a tolerant tester in the sparse-graphs model. The sparse-graphs model allows access to degree queries and neighbor queries, and the distance is defined with respect to the actual number of edges. More specifically, our algorithm distinguishes between graphs that are $ε$-close to having arboricity $α$ and graphs that are $c \cdot ε$-far from having arboricity $3α$, where $c$ is an absolute small constant. The query complexity and running time of the algorithm are $\tilde{O}\left(\frac{n}{\sqrt{m}}\cdot \frac{\log(1/ε)}ε + \frac{n\cdot α}{m} \cdot \left(\frac{1}ε\right)^{O(\log(1/ε))}\right)$ where $n$ denotes the number of vertices and $m$ denotes the number of edges. In terms of the dependence on $n$ and $m$ this bound is optimal up to poly-logarithmic factors since $Ω(n/\sqrt{m})$ queries are necessary (and $α= O(\sqrt{m})$).
We leave it as an open question whether the dependence on $1/ε$ can be improved from quasi-polynomial to polynomial. Our techniques include an efficient local simulation for approximating the outcome of a global (almost) forest-decomposition algorithm as well as a tailored procedure of edge sampling.
Submitted 27 April, 2021; v1 submitted 16 July, 2017;
originally announced July 2017.
-
Faster and Simpler Distributed Algorithms for Testing and Correcting Graph Properties in the CONGEST-Model
Authors:
Guy Even,
Reut Levi,
Moti Medina
Abstract:
In this paper we present distributed testing algorithms of graph properties in the CONGEST-model [Censor-Hillel et al. 2016]. We present one-sided error testing algorithms in the general graph model.
We first describe a general procedure for converting $ε$-testers that run in $f(D)$ rounds, where $D$ denotes the diameter of the graph, into testers that run in $O((\log n)/ε)+f((\log n)/ε)$ rounds, where $n$ is the number of processors of the network. We then apply this procedure to obtain an optimal tester, in terms of $n$, for testing bipartiteness, whose round complexity is $O(ε^{-1}\log n)$, which improves over the $poly(ε^{-1} \log n)$-round algorithm by Censor-Hillel et al. (DISC 2016). Moreover, for cycle-freeness, we obtain a \emph{corrector} of the graph that locally corrects the graph so that the corrected graph is acyclic. Note that, unlike a tester, a corrector needs to mend the graph in many places when the graph is far from having the property.
In the second part of the paper we design algorithms for testing whether the network is $H$-free for any connected $H$ of size up to four with round complexity of $O(ε^{-1})$. This improves over the $O(ε^{-2})$-round algorithms for testing triangle freeness by Censor-Hillel et al. (DISC 2016) and for testing excluded graphs of size $4$ by Fraigniaud et al. (DISC 2016).
In the last part we generalize the global tester by Iwama and Yoshida (ITCS 2014) of testing $k$-path freeness to testing the exclusion of any tree of order $k$. We then show how to simulate this algorithm in the CONGEST-model in $O(k^{k^2+1}\cdot ε^{-k})$ rounds.
Submitted 13 May, 2017;
originally announced May 2017.
-
A Local Algorithm for the Sparse Spanning Graph Problem
Authors:
Christoph Lenzen,
Reut Levi
Abstract:
Constructing a sparse spanning subgraph is a fundamental primitive in graph theory. In this paper, we study this problem in the Centralized Local model, where the goal is to decide whether an edge is part of the spanning subgraph by examining only a small part of the input; yet, answers must be globally consistent and independent of prior queries.
Unfortunately, maximally sparse spanning subgraphs, i.e., spanning trees, cannot be constructed efficiently in this model. Therefore, we settle for a spanning subgraph containing at most $(1+\varepsilon)n$ edges (where $n$ is the number of vertices and $\varepsilon$ is a given approximation/sparsity parameter). We achieve query complexity of $\tilde{O}(poly(Δ/\varepsilon)n^{2/3})$, where $Δ$ is the maximum degree of the input graph (the $\tilde{O}$-notation hides polylogarithmic factors in $n$). Our algorithm is the first to do so on arbitrary bounded degree graphs. Moreover, we achieve the additional property that our algorithm outputs a spanner, i.e., distances are approximately preserved. With high probability, for each deleted edge there is a path of $O(poly(Δ/\varepsilon)\log^2 n)$ hops in the output that connects its endpoints.
Submitted 18 July, 2017; v1 submitted 15 March, 2017;
originally announced March 2017.
-
A Local Algorithm for Constructing Spanners in Minor-Free Graphs
Authors:
Reut Levi,
Dana Ron,
Ronitt Rubinfeld
Abstract:
Constructing a spanning tree of a graph is one of the most basic tasks in graph theory. We consider this problem in the setting of local algorithms: one wants to quickly determine whether a given edge $e$ is in a specific spanning tree, without computing the whole spanning tree, but rather by inspecting the local neighborhood of $e$. The challenge is to maintain consistency. That is, to answer queries about different edges according to the same spanning tree. Since it is known that this problem cannot be solved without essentially viewing all the graph, we consider the relaxed version of finding a spanning subgraph with $(1+ε)n$ edges (where $n$ is the number of vertices and $ε$ is a given sparsity parameter). It is known that this relaxed problem requires inspecting $Ω(\sqrt{n})$ edges in general graphs, which motivates the study of natural restricted families of graphs. One such family is the family of graphs with an excluded minor. For this family there is an algorithm that achieves constant success probability, and inspects $(d/ε)^{poly(h)\log(1/ε)}$ edges (for each edge it is queried on), where $d$ is the maximum degree in the graph and $h$ is the size of the excluded minor. The distances between pairs of vertices in the spanning subgraph $G'$ are at most a factor of $poly(d, 1/ε, h)$ larger than in $G$.
In this work, we show that for an input graph that is $H$-minor free for any $H$ of size $h$, this task can be performed by inspecting only $poly(d, 1/ε, h)$ edges. The distances between pairs of vertices in the spanning subgraph $G'$ are at most a factor of $\tilde{O}(h\log(d)/ε)$ larger than in $G$. Furthermore, the error probability of the new algorithm is significantly improved to $Θ(1/n)$. This algorithm can also be easily adapted to yield an efficient algorithm for the distributed setting.
Submitted 24 April, 2016;
originally announced April 2016.
-
Quantifying topological invariants of neuronal morphologies
Authors:
Lida Kanari,
Paweł Dłotko,
Martina Scolamiero,
Ran Levi,
Julian Shillcock,
Kathryn Hess,
Henry Markram
Abstract:
Nervous systems are characterized by neurons displaying a diversity of morphological shapes. Traditionally, different shapes have been qualitatively described based on visual inspection and quantitatively described based on morphometric parameters. Neither process provides a solid foundation for categorizing the various morphologies, a problem that is important in many fields. We propose a stable topological measure as a standardized descriptor for any tree-like morphology, which encodes its skeletal branching anatomy. More specifically it is a barcode of the branching tree as determined by a spherical filtration centered at the root or neuronal soma. This Topological Morphology Descriptor (TMD) allows for the discrimination of groups of random and neuronal trees at linear computational cost.
Submitted 28 March, 2016;
originally announced March 2016.
-
Sublinear Random Access Generators for Preferential Attachment Graphs
Authors:
Guy Even,
Reut Levi,
Moti Medina,
Adi Rosen
Abstract:
We consider the problem of sampling from a distribution on graphs, specifically when the distribution is defined by an evolving graph model, and consider the time, space and randomness complexities of such samplers.
In the standard approach, the whole graph is chosen randomly according to the randomized evolving process, stored in full, and then queries on the sampled graph are answered by simply accessing the stored graph. This may require prohibitive amounts of time, space and random bits, especially when only a small number of queries are actually issued. Instead, we propose to generate the graph on-the-fly, in response to queries, and therefore to require amounts of time, space, and random bits which are a function of the actual number of queries.
We focus on two random graph models: the Barabási-Albert Preferential Attachment model (BA-graphs) and the random recursive tree model. We give on-the-fly generation algorithms for both models. With probability $1-1/\mbox{poly}(n)$, each and every query is answered in $\mbox{polylog}(n)$ time, and the increase in space and the number of random bits consumed by any single query are both $\mbox{polylog}(n)$, where $n$ denotes the number of vertices in the graph.
Our results show that, although the BA random graph model is defined by a sequential process, efficient random access to the graph's nodes is possible. In addition to the conceptual contribution, efficient on-the-fly generation of random graphs can serve as a tool for the efficient simulation of sublinear algorithms over large BA-graphs, and the efficient estimation of their performance on such graphs.
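To make the on-the-fly idea concrete, here is a minimal sketch for the simpler of the two models, the random recursive tree, in which node $i \geq 1$ attaches to a uniformly random earlier node. It answers parent queries only and is not the algorithm of the paper, which also covers BA-graphs. The names `parent` and `depth`, and the use of a SHA-256 hash of `(seed, i)` as a stand-in for fresh random bits, are assumptions made for this illustration.

```python
import hashlib

def parent(seed, i):
    """Parent of node i (i >= 1) in a random recursive tree, derived on demand.

    Node i attaches to a pseudo-uniform earlier node in {0, ..., i-1}. The
    hash of (seed, i) stands in for fresh random bits, so repeated queries
    agree without the tree ever being stored.
    """
    if i < 1:
        raise ValueError("node 0 is the root and has no parent")
    digest = hashlib.sha256(f"{seed}:{i}".encode()).digest()
    return int.from_bytes(digest, "big") % i

def depth(seed, i):
    """Depth of node i, found by following parent pointers up to the root."""
    d = 0
    while i != 0:
        i = parent(seed, i)
        d += 1
    return d
```

A call such as `parent(42, 10**9)` touches a single node rather than generating a billion-node tree first. Queries in the other direction (for example, listing the children of a node) are not covered by this sketch.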
Submitted 19 May, 2017; v1 submitted 19 February, 2016;
originally announced February 2016.
-
Non-Local Probes Do Not Help with Graph Problems
Authors:
Mika Göös,
Juho Hirvonen,
Reut Levi,
Moti Medina,
Jukka Suomela
Abstract:
This work bridges the gap between distributed and centralised models of computing in the context of sublinear-time graph algorithms. A priori, typical centralised models of computing (e.g., parallel decision trees or centralised local algorithms) seem to be much more powerful than distributed message-passing algorithms: centralised algorithms can directly probe any part of the input, while in distributed algorithms nodes can only communicate with their immediate neighbours. We show that for a large class of graph problems, this extra freedom does not help centralised algorithms at all: for example, efficient stateless deterministic centralised local algorithms can be simulated with efficient distributed message-passing algorithms. In particular, this enables us to transfer existing lower bound results from distributed algorithms to centralised local algorithms.
Submitted 16 December, 2015;
originally announced December 2015.
-
Dynamic Allocation Problems in Loss Network Systems with Advanced Reservation
Authors:
Retsef Levi,
Cong Shi
Abstract:
We consider a class of well-known dynamic resource allocation models in loss network systems with advanced reservation. The most important performance measure in any loss network system is its blocking probability, i.e., the probability that an arriving customer in equilibrium finds a fully utilized system (and is therefore rejected by the system). In this paper, we derive upper bounds on the asymptotic blocking probabilities for such systems in high-volume regimes. There have been relatively few results on loss network systems with advanced reservation due to their inherent complexity. The theoretical results find applications in a wide class of revenue management problems in systems with reusable resources and advanced reservation, e.g., hotel room, car rental and workforce management. We propose a simple control policy called the improved class selection policy (ICSP) based on solving a continuous knapsack problem, similar in spirit to the one proposed in Levi and Radovanovic (2010). Using our results derived for loss network systems with advanced reservation, we show that the ICSP is asymptotically near-optimal in high-volume regimes.
Submitted 14 May, 2015;
originally announced May 2015.
-
Local Computation Algorithms for Graphs of Non-Constant Degrees
Authors:
Reut Levi,
Ronitt Rubinfeld,
Anak Yodpinyanee
Abstract:
In the model of \emph{local computation algorithms} (LCAs), we aim to compute the queried part of the output by examining only a small (sublinear) portion of the input. Many recently developed LCAs on graph problems achieve time and space complexities with very low dependence on $n$, the number of vertices. Nonetheless, these complexities are generally at least exponential in $d$, the upper bound on the degree of the input graph. Instead, we consider the case where parameter $d$ can be moderately dependent on $n$, and aim for complexities with subexponential dependence on $d$, while maintaining polylogarithmic dependence on $n$. We present: a randomized LCA for computing maximal independent sets whose time and space complexities are quasi-polynomial in $d$ and polylogarithmic in $n$; for constant $ε> 0$, a randomized LCA that provides a $(1-ε)$-approximation to maximum matching whose time and space complexities are polynomial in $d$ and polylogarithmic in $n$.
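As a point of reference for what a local query looks like, the sketch below simulates greedy maximal independent set over a pseudo-random vertex order: a vertex is in the set exactly when none of its lower-ranked neighbours is. This is a classic textbook-style simulation, not the algorithm of the paper, and it makes no attempt at the degree dependence the abstract targets. The name `make_mis_oracle`, the hash-based ranks, and the adjacency-dictionary input are assumptions made for this illustration.

```python
import hashlib
from functools import lru_cache

def make_mis_oracle(adj, seed=0):
    """Return in_mis(v), answering membership in one fixed MIS of `adj`.

    `adj` maps each vertex to a list of its neighbours (undirected graph).
    """
    def rank(v):
        # Pseudo-random priority; the vertex label breaks (unlikely) ties.
        h = hashlib.sha256(f"{seed}:{v}".encode()).hexdigest()
        return (h, v)

    @lru_cache(maxsize=None)
    def in_mis(v):
        # v joins the MIS iff no lower-ranked neighbour is in it. The
        # recursion only descends to strictly lower ranks, so it terminates.
        return not any(in_mis(u) for u in adj[v] if rank(u) < rank(v))

    return in_mis
```

Because every answer is determined by the seed and the graph, the answers are consistent with a single maximal independent set regardless of the order in which vertices are queried; the cache only avoids repeated work.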
Submitted 13 February, 2015;
originally announced February 2015.
-
Constructing Near Spanning Trees with Few Local Inspections
Authors:
Reut Levi,
Guy Moshkovitz,
Dana Ron,
Ronitt Rubinfeld,
Asaf Shapira
Abstract:
Constructing a spanning tree of a graph is one of the most basic tasks in graph theory. Motivated by several recent studies of local graph algorithms, we consider the following variant of this problem. Let $G$ be a connected bounded-degree graph. Given an edge $e$ in $G$ we would like to decide whether $e$ belongs to a connected subgraph $G'$ consisting of $(1+ε)n$ edges (for a prespecified constant $ε>0$), where the decision for different edges should be consistent with the same subgraph $G'$. Can this task be performed by inspecting only a {\em constant} number of edges in $G$? Our main results are:
(1) We show that if every $t$-vertex subgraph of $G$ has expansion $1/(\log t)^{1+o(1)}$ then one can (deterministically) construct a sparse spanning subgraph $G'$ of $G$ using few inspections. To this end we analyze a "local" version of a famous minimum-weight spanning tree algorithm.
(2) We show that the above expansion requirement is sharp even when allowing randomization. To this end we construct a family of $3$-regular graphs of high girth, in which every $t$-vertex subgraph has expansion $1/(\log t)^{1-o(1)}$.
Submitted 3 February, 2015; v1 submitted 2 February, 2015;
originally announced February 2015.
-
Local Algorithms for Sparse Spanning Graphs
Authors:
Reut Levi,
Dana Ron,
Ronitt Rubinfeld
Abstract:
Constructing a spanning tree of a graph is one of the most basic tasks in graph theory. We consider a relaxed version of this problem in the setting of local algorithms. The relaxation is that the constructed subgraph is a sparse spanning subgraph containing at most $(1+ε)n$ edges (where $n$ is the number of vertices and $ε$ is a given approximation/sparsity parameter). In the local setting, the goal is to quickly determine whether a given edge $e$ belongs to such a subgraph, without constructing the whole subgraph, but rather by inspecting (querying) the local neighborhood of $e$. The challenge is to maintain consistency. That is, to provide answers concerning different edges according to the same spanning subgraph.
We first show that for general bounded-degree graphs, the query complexity of any such algorithm must be $Ω(\sqrt{n})$. This lower bound holds for constant-degree graphs that have high expansion. Next we design an algorithm for (bounded-degree) graphs with high expansion, obtaining a result that roughly matches the lower bound. We then turn to study graphs that exclude a fixed minor (and are hence non-expanding). We design an algorithm for such graphs, which may have an unbounded maximum degree. The query complexity of this algorithm is $poly(1/ε, h)$ (independent of $n$ and the maximum degree), where $h$ is the number of vertices in the excluded minor.
Though our two algorithms are designed for very different types of graphs (and have very different complexities), on a high-level there are several similarities, and we highlight both the similarities and the differences.
Submitted 27 April, 2021; v1 submitted 14 February, 2014;
originally announced February 2014.
-
A Quasi-Polynomial Time Partition Oracle for Graphs with an Excluded Minor
Authors:
Reut Levi,
Dana Ron
Abstract:
Motivated by the problem of testing planarity and related properties, we study the problem of designing efficient {\em partition oracles}. A {\em partition oracle} is a procedure that, given access to the incidence lists representation of a bounded-degree graph $G= (V,E)$ and a parameter $ε$, when queried on a vertex $v\in V$, returns the part (subset of vertices) which $v$ belongs to in a partition of all graph vertices. The partition should be such that all parts are small, each part is connected, and if the graph has certain properties, the total number of edges between parts is at most $ε |V|$. In this work we give a partition oracle for graphs with excluded minors whose query complexity is quasi-polynomial in $1/ε$, thus improving on the result of Hassidim et al. ({\em Proceedings of FOCS 2009}) who gave a partition oracle with query complexity exponential in $1/ε$. This improvement implies corresponding improvements in the complexity of testing planarity and other properties that are characterized by excluded minors as well as sublinear-time approximation algorithms that work under the promise that the graph has an excluded minor.
Submitted 14 February, 2013;
originally announced February 2013.
-
A simple online competitive adaptation of Lempel-Ziv compression with efficient random access support
Authors:
Akashnil Dutta,
Reut Levi,
Dana Ron,
Ronitt Rubinfeld
Abstract:
We present a simple adaptation of the Lempel-Ziv '78 (LZ78) compression scheme ({\em IEEE Transactions on Information Theory, 1978}) that supports efficient random access to the input string. Namely, given query access to the compressed string, it is possible to efficiently recover any symbol of the input string. The compression algorithm is given as input a parameter $ε >0$, and with very high probability increases the length of the compressed string by at most a factor of $(1+ε)$. The access time is $O(\log n + 1/ε^2)$ in expectation, and $O(\log n/ε^2)$ with high probability. The scheme relies on sparse transitive-closure spanners. Any (consecutive) substring of the input string can be retrieved at an additional additive cost in the running time equal to the length of the substring. We also formally establish the necessity of modifying LZ78 so as to allow efficient random access. Specifically, we construct a family of strings for which $Ω(n/\log n)$ queries to the LZ78-compressed string are required in order to recover a single symbol in the input string. The main benefit of the proposed scheme is that it preserves the online nature and simplicity of LZ78, and that for {\em every} input string, the length of the compressed string is only a small factor larger than that obtained by running LZ78.
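For readers unfamiliar with the baseline, the following is a minimal sketch of plain LZ78, not of the modified scheme described above. The helper `symbol_at` recovers one symbol by scanning the compressed pairs to locate the containing phrase and then following back-pointers; it is only a naive baseline that illustrates why random access to unmodified LZ78 output is not free. All function names are hypothetical.

```python
def lz78_compress(text):
    """Plain LZ78: emit (phrase_index, next_char) pairs; index 0 is the empty phrase."""
    dictionary = {"": 0}
    out = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch
        else:
            out.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        # Leftover phrase is already known: emit its prefix plus its last char.
        out.append((dictionary[phrase[:-1]], phrase[-1]))
    return out

def lz78_decompress(pairs):
    """Rebuild the text by expanding every phrase in order."""
    phrases = [""]
    for idx, ch in pairs:
        phrases.append(phrases[idx] + ch)
    return "".join(phrases[1:])

def symbol_at(pairs, pos):
    """Recover text[pos] by a linear scan over the pairs (naive baseline)."""
    lengths = [0]  # lengths[k] is the length of phrase k
    start = 0      # position in the text where phrase k begins
    for k, (idx, _) in enumerate(pairs, start=1):
        lengths.append(lengths[idx] + 1)
        if pos < start + lengths[k]:
            offset = pos - start
            # Follow back-pointers to the phrase whose final char sits at offset.
            while lengths[k] - 1 > offset:
                k = pairs[k - 1][0]
            return pairs[k - 1][1]
        start += lengths[k]
    raise IndexError(pos)
```

The scan in `symbol_at` reads a prefix of the compressed string whose length depends on `pos`, which is the kind of cost the scheme in the abstract is designed to avoid.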
Submitted 11 January, 2013;
originally announced January 2013.