-
Approximately Counting and Sampling Hamiltonian Motifs in Sublinear Time
Authors:
Talya Eden,
Reut Levi,
Dana Ron,
Ronitt Rubinfeld
Abstract:
Counting small subgraphs, referred to as motifs, in large graphs is a fundamental task in graph analysis, extensively studied across various contexts and computational models. In the sublinear-time regime, the relaxed problem of approximate counting has been explored within two prominent query frameworks: the standard model, which permits degree, neighbor, and pair queries, and the strictly more powerful augmented model, which additionally allows for uniform edge sampling. Currently, in the standard model, (optimal) results have been established only for approximately counting edges, stars, and cliques, all of which have a radius of one. This contrasts sharply with the state of affairs in the augmented model, where algorithmic results (some of which are optimal) are known for any input motif, leading to a disparity which we term the ``scope gap" between the two models.
In this work, we make significant progress in bridging this gap. Our approach draws inspiration from recent advancements in the augmented model and utilizes a framework centered on counting by uniform sampling, thus allowing us to establish new results in the standard model and simplify previous results.
In particular, our first, and main, contribution is a new algorithm in the standard model for approximately counting any Hamiltonian motif in sublinear time. Our second contribution is a variant of our algorithm that enables nearly uniform sampling of these motifs, a capability previously limited in the standard model to edges and cliques. Our third contribution is to introduce even simpler algorithms for stars and cliques by exploiting their radius-one property. As a result, we simplify all previously known algorithms in the standard model for stars (Gonen, Ron, Shavitt (SODA 2010)), triangles (Eden, Levi, Ron, Seshadhri (FOCS 2015)) and cliques (Eden, Ron, Seshadhri (STOC 2018)).
Submitted 12 March, 2025;
originally announced March 2025.
-
Time-Dependent Network Topology Optimization for LEO Satellite Constellations
Authors:
Dara Ron,
Faisal Ahmed Yusufzai,
Sebastian Kwakye,
Satyaki Roy,
Nishanth Sastry,
Vijay K. Shah
Abstract:
Today's Low Earth Orbit (LEO) satellite networks, exemplified by SpaceX's Starlink, play a crucial role in delivering global internet access to millions of users. However, managing the dynamic and expansive nature of these networks poses significant challenges in designing optimal satellite topologies over time. In this paper, we introduce the \underline{D}ynamic Time-Expanded Graph (DTEG)-based \underline{O}ptimal \underline{T}opology \underline{D}esign (DoTD) algorithm to tackle these challenges effectively. We first formulate a novel space network topology optimization problem encompassing a multi-objective function -- maximize network capacity, minimize latency, and mitigate link churn -- under key inter-satellite link constraints. Our proposed approach addresses this optimization problem by transforming the objective functions and constraints into a time-dependent scoring function. This empowers each LEO satellite to assess potential connections based on their dynamic performance scores, ensuring robust network performance over time without scalability issues. Additionally, we prove that the score function is bounded (it does not approach infinity), thus allowing each satellite to consistently evaluate others over time. For evaluation purposes, we utilize a realistic Mininet-based LEO satellite emulation tool that leverages Starlink's Two-Line Element (TLE) data. Comparative evaluation against two baseline methods -- Greedy and $+$Grid -- demonstrates the superior performance of our algorithm in optimizing network efficiency and resilience.
Submitted 22 January, 2025;
originally announced January 2025.
-
Testing $C_k$-freeness in bounded-arboricity graphs
Authors:
Talya Eden,
Reut Levi,
Dana Ron
Abstract:
We study the problem of testing $C_k$-freeness ($k$-cycle-freeness) for fixed constant $k > 3$ in graphs with bounded arboricity (but unbounded degrees). In particular, we are interested in one-sided error algorithms, so that they must detect a copy of $C_k$ with high constant probability when the graph is $ε$-far from $C_k$-free. We next state our results for constant arboricity and constant $ε$ with a focus on the dependence on the number of graph vertices, $n$. The query complexity of all our algorithms grows polynomially with $1/ε$. (1) As opposed to the case of $k=3$, where the complexity of testing $C_3$-freeness grows with the arboricity of the graph but not with the size of the graph (Levi, ICALP 2021) this is no longer the case already for $k=4$. We show that $Ω(n^{1/4})$ queries are necessary for testing $C_4$-freeness, and that $\widetilde{O}(n^{1/4})$ are sufficient. The same bounds hold for $C_5$. (2) For every fixed $k \geq 6$, any one-sided error algorithm for testing $C_k$-freeness must perform $Ω(n^{1/3})$ queries. (3) For $k=6$ we give a testing algorithm whose query complexity is $\widetilde{O}(n^{1/2})$. (4) For any fixed $k$, the query complexity of testing $C_k$-freeness is upper bounded by ${O}(n^{1-1/\lfloor k/2\rfloor})$.
Our $Ω(n^{1/4})$ lower bound for testing $C_4$-freeness in constant arboricity graphs provides a negative answer to an open problem posed by Goldreich (2021).
Submitted 28 April, 2024;
originally announced April 2024.
-
Sample-based distance-approximation for subsequence-freeness
Authors:
Omer Cohen Sidon,
Dana Ron
Abstract:
In this work, we study the problem of approximating the distance to subsequence-freeness in the sample-based distribution-free model. For a given subsequence (word) $w = w_1 \dots w_k$, a sequence (text) $T = t_1 \dots t_n$ is said to contain $w$ if there exist indices $1 \leq i_1 < \dots < i_k \leq n$ such that $t_{i_{j}} = w_j$ for every $1 \leq j \leq k$. Otherwise, $T$ is $w$-free. Ron and Rosin (ACM TOCT 2022) showed that the number of samples both necessary and sufficient for one-sided error testing of subsequence-freeness in the sample-based distribution-free model is $Θ(k/ε)$. Denoting by $Δ(T,w,p)$ the distance of $T$ to $w$-freeness under a distribution $p :[n]\to [0,1]$, we are interested in obtaining an estimate $\widehatΔ$, such that $|\widehatΔ - Δ(T,w,p)| \leq δ$ with probability at least $2/3$, for a given distance parameter $δ$. Our main result is an algorithm whose sample complexity is $\tilde{O}(k^2/δ^2)$. We first present an algorithm that works when the underlying distribution $p$ is uniform, and then show how it can be modified to work for any (unknown) distribution $p$. We also show that a quadratic dependence on $1/δ$ is necessary.
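The containment definition above (indices $i_1 < \dots < i_k$ with $t_{i_j} = w_j$) can be checked with a single left-to-right scan. The following is a minimal sketch of that definition only, not of the distance-approximation algorithm; the function names `contains_subsequence` and `is_free` are hypothetical and chosen here for illustration.

```python
from typing import Sequence


def contains_subsequence(text: Sequence, word: Sequence) -> bool:
    """Return True if `word` occurs in `text` as a subsequence.

    The occurrence need not be contiguous: we look for increasing
    positions in `text` whose symbols spell out `word`.
    """
    matched = 0  # number of symbols of `word` matched so far
    for symbol in text:
        # Greedily match the next needed symbol at its earliest position.
        if matched < len(word) and symbol == word[matched]:
            matched += 1
    return matched == len(word)


def is_free(text: Sequence, word: Sequence) -> bool:
    """Return True if `text` is `word`-free in the sense of the abstract."""
    return not contains_subsequence(text, word)
```

Matching each symbol of the word at its earliest possible position is safe because any later match of the same symbol leaves no more room for the remaining symbols than the earliest one does.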
Submitted 2 May, 2023;
originally announced May 2023.
-
Testing Distributions of Huge Objects
Authors:
Oded Goldreich,
Dana Ron
Abstract:
We initiate a study of a new model of property testing that is a hybrid of testing properties of distributions and testing properties of strings. Specifically, the new model refers to testing properties of distributions, but these are distributions over huge objects (i.e., very long strings). Accordingly, the model accounts for the total number of local probes into these objects (resp., queries to the strings) as well as for the distance between objects (resp., strings), and the distance between distributions is defined as the earth mover's distance with respect to the relative Hamming distance between strings.
We study the query complexity of testing in this new model, focusing on three directions. First, we try to relate the query complexity of testing properties in the new model to the sample complexity of testing these properties in the standard distribution testing model. Second, we consider the complexity of testing properties that arise naturally in the new model (e.g., distributions that capture random variations of fixed strings). Third, we consider the complexity of testing properties that were extensively studied in the standard distribution testing model: Two such cases are uniform distributions and pairs of identical distributions.
Submitted 30 December, 2023; v1 submitted 24 December, 2022;
originally announced December 2022.
-
Reducing Computational Complexity of Neural Networks in Optical Channel Equalization: From Concepts to Implementation
Authors:
Pedro J. Freire,
Antonio Napoli,
Diego Arguello Ron,
Bernhard Spinnler,
Michael Anderson,
Wolfgang Schairer,
Thomas Bex,
Nelson Costa,
Sergei K. Turitsyn,
Jaroslaw E. Prilepsky
Abstract:
In this paper, a new methodology is proposed that allows for the low-complexity development of neural network (NN) based equalizers for the mitigation of impairments in high-speed coherent optical transmission systems. In this work, we provide a comprehensive description and comparison of various deep model compression approaches that have been applied to feed-forward and recurrent NN designs. Additionally, we evaluate the influence these strategies have on the performance of each NN equalizer. Quantization, weight clustering, pruning, and other cutting-edge strategies for model compression are taken into consideration. In this work, we propose and evaluate a Bayesian optimization-assisted compression, in which the hyperparameters of the compression are chosen to simultaneously reduce complexity and improve performance. In conclusion, the trade-off between the complexity of each compression approach and its performance is evaluated by utilizing both simulated and experimental data in order to complete the analysis. By utilizing optimal compression approaches, we show that it is possible to design an NN-based equalizer that is simpler to implement and has better performance than the conventional digital back-propagation (DBP) equalizer with only one step per span. This is accomplished by reducing the number of multipliers used in the NN equalizer after applying the weighted clustering and pruning algorithms. Furthermore, we demonstrate that an equalizer based on NN can also achieve superior performance while still maintaining the same degree of complexity as the full electronic chromatic dispersion compensation block. We conclude our analysis by highlighting open questions and existing challenges, as well as possible future research directions.
Submitted 26 November, 2022; v1 submitted 26 August, 2022;
originally announced August 2022.
-
Stochastic resonance neurons in artificial neural networks
Authors:
Egor Manuylovich,
Diego Argüello Ron,
Morteza Kamalian-Kopae,
Sergei Turitsyn
Abstract:
Many modern applications of artificial neural networks entail a large number of layers, making traditional digital implementations increasingly complex. Optical neural networks offer parallel processing at high bandwidth, but face the challenge of noise accumulation. We propose a new type of neural network that uses stochastic resonances as an inherent part of the architecture, and demonstrate the possibility of a significant reduction in the number of neurons required for a given performance accuracy. We also show that such a neural network is more robust against the impact of noise.
Submitted 23 August, 2022; v1 submitted 6 May, 2022;
originally announced May 2022.
-
The Structure of Configurations in One-Dimensional Majority Cellular Automata: From Cell Stability to Configuration Periodicity
Authors:
Yonatan Nakar,
Dana Ron
Abstract:
We study the dynamics of (synchronous) one-dimensional cellular automata with cyclical boundary conditions that evolve according to the majority rule with radius $ r $. We introduce a notion that we term cell stability with which we express the structure of the possible configurations that could emerge in this setting. Our main finding is that apart from the configurations of the form $ (0^{r+1}0^* + 1^{r+1}1^*)^* $, which are always fixed-points, the other configurations that the automata could possibly converge to, which are known to be either fixed-points or 2-cycles, have a particular spatially periodic structure. Namely, each of these configurations is of the form $ s^* $ where $ s $ consists of $ O(r^2) $ consecutive sequences of cells with the same state, each such sequence is of length at most $ r $, and the total length of $ s $ is $ O(r^2) $ as well. We show that an analogous result also holds for the minority rule.
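To make the setting concrete, the following is a minimal sketch of one synchronous update of the majority rule with radius $r$ on a cyclic configuration. The function name `majority_step` is hypothetical, and the sketch assumes binary states and a configuration of length at least $2r+1$, so that each window of $2r+1$ cells contains distinct cells.

```python
from typing import List


def majority_step(config: List[int], r: int) -> List[int]:
    """Apply one synchronous majority-rule update with radius r.

    Each cell takes the majority state among the 2r+1 cells centered
    on it, with indices wrapping around (cyclic boundary conditions).
    """
    n = len(config)
    nxt = []
    for i in range(n):
        ones = sum(config[(i + d) % n] for d in range(-r, r + 1))
        # The window has odd size 2r+1, so a majority always exists.
        nxt.append(1 if ones > r else 0)
    return nxt
```

For example, with $r=2$ the configuration `[0, 0, 0, 1, 1, 1]` consists of blocks of length $r+1$ and is left unchanged by `majority_step`, while with $r=1$ the alternating configuration `[0, 1, 0, 1]` maps to `[1, 0, 1, 0]` and back, which is a 2-cycle.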
Submitted 3 June, 2022; v1 submitted 18 May, 2022;
originally announced May 2022.
-
Approximating the Arboricity in Sublinear Time
Authors:
Talya Eden,
Saleet Mossel,
Dana Ron
Abstract:
We consider the problem of approximating the arboricity of a graph $G= (V,E)$, which we denote by $\mathsf{arb}(G)$, in sublinear time, where the arboricity of a graph is the minimal number of forests required to cover its edges. An algorithm for this problem may perform degree and neighbor queries, and is allowed a small error probability. We design an algorithm that outputs an estimate $\hatα$, such that with probability $1-1/\textrm{poly}(n)$, $\mathsf{arb}(G)/c\log^2 n \leq \hatα \leq \mathsf{arb}(G)$, where $n=|V|$ and $c$ is a constant. The expected query complexity and running time of the algorithm are
$O(n/\mathsf{arb}(G))\cdot \textrm{poly}(\log n)$, and this upper bound also holds with high probability.
This bound is optimal for such an approximation up to a $\textrm{poly}(\log n)$ factor.
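For intuition about the quantity being approximated, the following sketch computes the arboricity of a tiny graph exactly by brute force. It is not the sublinear algorithm of the abstract: it relies on the Nash-Williams characterization, under which the arboricity equals the maximum of $\lceil m_S/(|S|-1) \rceil$ over vertex subsets $S$ with $|S| \geq 2$, where $m_S$ is the number of edges inside $S$. The function name `arboricity_bruteforce` is hypothetical, and the sketch assumes a simple graph on vertices $0, \dots, n-1$ with each edge listed once.

```python
from itertools import combinations
from typing import Iterable, Tuple


def arboricity_bruteforce(n: int, edges: Iterable[Tuple[int, int]]) -> int:
    """Exact arboricity via the Nash-Williams formula (exponential time)."""
    edge_list = list(edges)
    best = 0
    for size in range(2, n + 1):
        for subset in combinations(range(n), size):
            members = set(subset)
            # Count the edges with both endpoints inside the subset.
            m_s = sum(1 for u, v in edge_list if u in members and v in members)
            # Integer ceiling of m_s / (size - 1).
            best = max(best, -(-m_s // (size - 1)))
    return best
```

The running time is exponential in $n$, so this is only useful for checking small examples, such as confirming that a tree has arboricity 1 and that the complete graph on four vertices has arboricity 2.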
Submitted 28 October, 2021;
originally announced October 2021.
-
Testing Dynamic Environments: Back to Basics
Authors:
Yonatan Nakar,
Dana Ron
Abstract:
We continue the line of work initiated by Goldreich and Ron (Journal of the ACM, 2017) on testing dynamic environments and propose to pursue a systematic study of the complexity of testing basic dynamic environments and local rules. As a first step, in this work we focus on dynamic environments that correspond to elementary cellular automata that evolve according to threshold rules.
Our main result is the identification of a set of conditions on local rules, and a meta-algorithm that tests evolution according to local rules that satisfy the conditions. The meta-algorithm has query complexity poly$ (1/ε) $, is non-adaptive and has one-sided error. We show that all the threshold rules satisfy the set of conditions, and therefore are poly$ (1/ε) $-testable. We believe that this is a rich area of research and suggest a variety of open problems and natural research directions that may extend and expand our results.
Submitted 4 May, 2021; v1 submitted 3 May, 2021;
originally announced May 2021.
-
Almost Optimal Bounds for Sublinear-Time Sampling of $k$-Cliques: Sampling Cliques is Harder Than Counting
Authors:
Talya Eden,
Dana Ron,
Will Rosenbaum
Abstract:
In this work, we consider the problem of sampling a $k$-clique in a graph from an almost uniform distribution in sublinear time in the general graph query model. Specifically the algorithm should output each $k$-clique with probability $(1\pm ε)/n_k$, where $n_k$ denotes the number of $k$-cliques in the graph and $ε$ is a given approximation parameter.
We prove that the query complexity of this problem is \[ Θ^*\left(\max\left\{ \left(\frac{(nα)^{k/2}}{ n_k}\right)^{\frac{1}{k-1}} ,\; \min\left\{nα,\frac{nα^{k-1}}{n_k} \right\}\right\}\right). \] where $n$ is the number of vertices in the graph, $α$ is its arboricity, and $Θ^*$ suppresses the dependence on $(\log n/ε)^{O(k)}$. Interestingly, this establishes a separation between approximate counting and approximate uniform sampling in the sublinear regime. For example, if $k=3$, $α= O(1)$, and $n_3$ (the number of triangles) is $Θ(n)$, then we get a lower bound of $Ω(n^{1/4})$ (for constant $ε$), while under these conditions, a $(1\pm ε)$-approximation of $n_3$ can be obtained by performing $\textrm{poly}(\log(n/ε))$ queries (Eden, Ron and Seshadhri, SODA20).
Our lower bound follows from a construction of a family of graphs with arboricity $α$ such that in each graph there are $n_k$ cliques (of size $k$), where one of these cliques is "hidden" and hence hard to sample. Our upper bound is based on defining a special auxiliary graph $H_k$, such that sampling edges almost uniformly in $H_k$ translates to sampling $k$-cliques almost uniformly in the original graph $G$. We then build on a known edge-sampling algorithm (Eden, Ron and Rosenbaum, ICALP19) to sample edges in $H_k$, where the challenge is to simulate queries to $H_k$ while being given access only to $G$.
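As a sanity check on the $k=3$ example in this abstract, one can substitute $k=3$, $α=1$ and $n_3=n$ into the two terms of the bound (treating the constants hidden in $α=O(1)$ and $n_3=Θ(n)$ as 1, an assumption made here only for illustration): \[ \left(\frac{(nα)^{3/2}}{n_3}\right)^{\frac{1}{2}} = \left(\frac{n^{3/2}}{n}\right)^{\frac{1}{2}} = n^{1/4}, \qquad \min\left\{nα,\frac{nα^{2}}{n_3}\right\} = \min\{n, 1\} = 1. \] The maximum of the two terms is therefore $n^{1/4}$, which agrees with the stated $Ω(n^{1/4})$ lower bound for constant $ε$.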
Submitted 7 December, 2020;
originally announced December 2020.
-
On Efficient Distance Approximation for Graph Properties
Authors:
Nimrod Fiat,
Dana Ron
Abstract:
A distance-approximation algorithm for a graph property $\mathcal{P}$ in the adjacency-matrix model is given an approximation parameter $ε\in (0,1)$ and query access to the adjacency matrix of a graph $G=(V,E)$. It is required to output an estimate of the \emph{distance} between $G$ and the closest graph $G'=(V,E')$ that satisfies $\mathcal{P}$, where the distance between graphs is the size of the symmetric difference between their edge sets, normalized by $|V|^2$. In this work we introduce property covers, as a framework for using distance-approximation algorithms for "simple" properties to design distance-approximation algorithms for more complex properties. Applying this framework we present distance-approximation algorithms with $\mathrm{poly}(1/ε)$ query complexity for induced $P_3$-freeness, induced $P_4$-freeness, and Chordality. For induced $C_4$-freeness our algorithm has query complexity $\exp(\mathrm{poly}(1/ε))$. These complexities essentially match the corresponding known results for testing these properties and provide an exponential improvement on previously known results.
Submitted 6 January, 2020;
originally announced January 2020.
-
Property testing of the Boolean and binary rank
Authors:
Michal Parnas,
Dana Ron,
Adi Shraibman
Abstract:
We present algorithms for testing if a $(0,1)$-matrix $M$ has Boolean/binary rank at most $d$, or is $ε$-far from Boolean/binary rank $d$ (i.e., at least an $ε$-fraction of the entries in $M$ must be modified so that it has rank at most $d$).
The query complexity of our testing algorithm for the Boolean rank is $\tilde{O}\left(d^4/ ε^6\right)$. For the binary rank we present a testing algorithm whose query complexity is $O(2^{2d}/ε)$.
Both algorithms are $1$-sided error algorithms that always accept $M$ if it has Boolean/binary rank at most $d$, and reject with probability at least $2/3$ if $M$ is $ε$-far from Boolean/binary rank $d$.
Submitted 30 August, 2019;
originally announced August 2019.
-
The Arboricity Captures the Complexity of Sampling Edges
Authors:
Talya Eden,
Dana Ron,
Will Rosenbaum
Abstract:
In this paper, we revisit the problem of sampling edges in an unknown graph $G = (V, E)$ from a distribution that is (pointwise) almost uniform over $E$. We consider the case where there is some a priori upper bound on the arboricity of $G$. Given query access to a graph $G$ over $n$ vertices and of average degree $d$ and arboricity at most $α$, we design an algorithm that performs $O\!\left(\frac{α}{d} \cdot \frac{\log^3 n}{\varepsilon}\right)$ queries in expectation and returns an edge in the graph such that every edge $e \in E$ is sampled with probability $(1 \pm \varepsilon)/m$. The algorithm performs two types of queries: degree queries and neighbor queries. We show that the upper bound is tight (up to poly-logarithmic factors and the dependence on $\varepsilon$), as $Ω\!\left(\frac{α}{d} \right)$ queries are necessary for the easier task of sampling edges from any distribution over $E$ that is close to uniform in total variation distance. We also prove that even if $G$ is a tree (i.e., $α= 1$ so that $\frac{α}{d}=Θ(1)$), $Ω\left(\frac{\log n}{\log\log n}\right)$ queries are necessary to sample an edge from any distribution that is pointwise close to uniform, thus establishing that a $\mathrm{poly}(\log n)$ factor is necessary for constant $α$. Finally we show how our algorithm can be applied to obtain a new result on approximately counting subgraphs, based on the recent work of Assadi, Kapralov, and Khanna (ITCS, 2019).
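To illustrate what sampling edges with degree and neighbor queries looks like, here is a minimal sketch of a simple rejection-sampling baseline. It is not the algorithm of the abstract: it assumes a known upper bound `d_max` on the maximum degree, and its expected number of iterations is `d_max` divided by the average degree rather than $α/d$. The names `sample_directed_edge`, `adj`, and `d_max` are hypothetical, and the graph is assumed to contain at least one edge (otherwise the loop does not terminate).

```python
import random
from typing import Dict, Hashable, List, Tuple


def sample_directed_edge(
    adj: Dict[Hashable, List[Hashable]], d_max: int, rng: random.Random
) -> Tuple[Hashable, Hashable]:
    """Return a uniformly random directed edge (v, u) by rejection sampling.

    `adj` maps each vertex to its neighbor list; `d_max` must be at least
    the maximum degree. Each iteration uses one degree query and, on
    acceptance, one neighbor query.
    """
    vertices = list(adj)
    while True:
        v = rng.choice(vertices)            # uniform vertex
        deg = len(adj[v])                   # degree query
        if rng.random() < deg / d_max:      # accept with probability deg(v)/d_max
            u = adj[v][rng.randrange(deg)]  # neighbor query at a uniform index
            return (v, u)
```

In a single iteration, each directed edge $(v,u)$ is returned with probability $\frac{1}{n}\cdot\frac{\deg(v)}{d_{\max}}\cdot\frac{1}{\deg(v)} = \frac{1}{n\, d_{\max}}$, which is the same for every edge, so the output is exactly uniform over directed edges. The cost is the rejection rate, which is high when the degree distribution is skewed.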
Submitted 21 February, 2019;
originally announced February 2019.
-
Faster sublinear approximations of $k$-cliques for low arboricity graphs
Authors:
Talya Eden,
Dana Ron,
C. Seshadhri
Abstract:
Given query access to an undirected graph $G$, we consider the problem of computing a $(1\pmε)$-approximation of the number of $k$-cliques in $G$. The standard query model for general graphs allows for degree queries, neighbor queries, and pair queries. Let $n$ be the number of vertices, $m$ be the number of edges, and $n_k$ be the number of $k$-cliques. Previous work by Eden, Ron and Seshadhri (STOC 2018) gives an $O^*(\frac{n}{n^{1/k}_k} + \frac{m^{k/2}}{n_k})$-time algorithm for this problem (we use $O^*(\cdot)$ to suppress $\mathrm{poly}(\log n, 1/ε, k^k)$ dependencies). Moreover, this bound is nearly optimal when the expression is sublinear in the size of the graph.
Our motivation is to circumvent this lower bound, by parameterizing the complexity in terms of \emph{graph arboricity}. The arboricity of $G$ is a measure for the graph density "everywhere". We design an algorithm for the class of graphs with arboricity at most $α$, whose running time is $O^*(\min\{\frac{nα^{k-1}}{n_k},\, \frac{n}{n_k^{1/k}}+\frac{m α^{k-2}}{n_k} \})$. We also prove a nearly matching lower bound. For all graphs, the arboricity is $O(\sqrt m)$, so this bound subsumes all previous results on sublinear clique approximation.
As a special case of interest, consider minor-closed families of graphs, which have constant arboricity. Our result implies that for any minor-closed family of graphs, there is a $(1\pmε)$-approximation algorithm for $n_k$ that has running time $O^*(\frac{n}{n_k})$. Such a bound was not known even for the special (classic) case of triangle counting in planar graphs.
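The standard query model mentioned in this abstract (degree, neighbor, and pair queries) can be pictured as an oracle wrapped around the graph. The following is a minimal sketch of such an oracle with a query counter; the class name `GraphOracle`, its method names, and the helper `is_triangle` are hypothetical and not taken from the paper.

```python
from typing import Dict, Hashable, List, Optional


class GraphOracle:
    """Query access to a graph; every call increments `queries`."""

    def __init__(self, adj: Dict[Hashable, List[Hashable]]) -> None:
        self._adj = {v: list(ns) for v, ns in adj.items()}
        self._sets = {v: set(ns) for v, ns in adj.items()}
        self.queries = 0

    def degree(self, v: Hashable) -> int:
        """Degree query: the number of neighbors of v."""
        self.queries += 1
        return len(self._adj[v])

    def neighbor(self, v: Hashable, i: int) -> Optional[Hashable]:
        """Neighbor query: the i-th neighbor of v (0-based), or None."""
        self.queries += 1
        ns = self._adj[v]
        return ns[i] if 0 <= i < len(ns) else None

    def pair(self, u: Hashable, v: Hashable) -> bool:
        """Pair query: whether {u, v} is an edge."""
        self.queries += 1
        return v in self._sets[u]


def is_triangle(oracle: GraphOracle, u, v, w) -> bool:
    """Check whether u, v, w form a triangle using at most three pair queries."""
    return oracle.pair(u, v) and oracle.pair(v, w) and oracle.pair(u, w)
```

Counting queries in the oracle rather than in the calling code keeps the accounting honest: a sublinear algorithm written against this interface can be compared directly with the query bounds quoted above.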
Submitted 11 November, 2018;
originally announced November 2018.
-
Property Testing of Planarity in the CONGEST model
Authors:
Reut Levi,
Moti Medina,
Dana Ron
Abstract:
We give a distributed algorithm in the {\sf CONGEST} model for property testing of planarity with one-sided error in general (unbounded-degree) graphs. Following Censor-Hillel et al. (DISC 2016), who recently initiated the study of property testing in the distributed setting, our algorithm gives the following guarantee: For a graph $G = (V,E)$ and a distance parameter $ε$, if $G$ is planar, then every node outputs {\sf accept\/}, and if $G$ is $ε$-far from being planar (i.e., more than $ε\cdot |E|$ edges need to be removed in order to make $G$ planar), then with probability $1-1/{\rm poly}(n)$ at least one node outputs {\sf reject}. The algorithm runs in $O(\log|V|\cdot{\rm poly}(1/ε))$ rounds, and we show that this result is tight in terms of the dependence on $|V|$.
Our algorithm combines several techniques of graph partitioning and local verification of planar embeddings. Furthermore, we show how a main subroutine in our algorithm can be applied to derive additional results for property testing of cycle-freeness and bipartiteness, as well as the construction of spanners, in minor-free (unweighted) graphs.
Submitted 14 August, 2019; v1 submitted 27 May, 2018;
originally announced May 2018.
-
Provable and practical approximations for the degree distribution using sublinear graph samples
Authors:
Talya Eden,
Shweta Jain,
Ali Pinar,
Dana Ron,
C. Seshadhri
Abstract:
The degree distribution is one of the most fundamental properties used in the analysis of massive graphs. There is a large literature on graph sampling, where the goal is to estimate properties (especially the degree distribution) of a large graph through a small, random sample. The degree distribution estimation poses a significant challenge, due to its heavy-tailed nature and the large variance in degrees.
We design a new algorithm, SADDLES, for this problem, using recent mathematical techniques from the field of sublinear algorithms. The SADDLES algorithm gives provably accurate outputs for all values of the degree distribution. For the analysis, we define two fatness measures of the degree distribution, called the $h$-index and the $z$-index. We prove that SADDLES is sublinear in the graph size when these indices are large. A corollary of this result is a provably sublinear algorithm for any degree distribution bounded below by a power law.
We deploy our new algorithm on a variety of real datasets and demonstrate its excellent empirical behavior. In all instances, we get extremely accurate approximations for all values in the degree distribution by observing at most $1\%$ of the vertices. This is a major improvement over the state-of-the-art sampling algorithms, which typically sample more than $10\%$ of the vertices to give comparable results. We also observe that the $h$ and $z$-indices of real graphs are large, validating our theoretical analysis.
Submitted 28 August, 2018; v1 submitted 24 October, 2017;
originally announced October 2017.
-
Testing bounded arboricity
Authors:
Talya Eden,
Reut Levi,
Dana Ron
Abstract:
In this paper we consider the problem of testing whether a graph has bounded arboricity. The family of graphs with bounded arboricity includes, among others, bounded-degree graphs, all minor-closed graph classes (e.g. planar graphs, graphs with bounded treewidth) and randomly generated preferential attachment graphs. Graphs with bounded arboricity have been studied extensively in the past, in particular since for many problems they allow for much more efficient algorithms and/or better approximation ratios.
We present a tolerant tester in the sparse-graphs model. The sparse-graphs model allows access to degree queries and neighbor queries, and the distance is defined with respect to the actual number of edges. More specifically, our algorithm distinguishes between graphs that are $ε$-close to having arboricity $α$ and graphs that are $c \cdot ε$-far from having arboricity $3α$, where $c$ is an absolute small constant. The query complexity and running time of the algorithm are $\tilde{O}\left(\frac{n}{\sqrt{m}}\cdot \frac{\log(1/ε)}{ε} + \frac{n\cdot α}{m} \cdot \left(\frac{1}{ε}\right)^{O(\log(1/ε))}\right)$ where $n$ denotes the number of vertices and $m$ denotes the number of edges. In terms of the dependence on $n$ and $m$ this bound is optimal up to poly-logarithmic factors since $Ω(n/\sqrt{m})$ queries are necessary (and $α = O(\sqrt{m})$).
We leave it as an open question whether the dependence on $1/ε$ can be improved from quasi-polynomial to polynomial. Our techniques include an efficient local simulation for approximating the outcome of a global (almost) forest-decomposition algorithm as well as a tailored procedure of edge sampling.
Submitted 27 April, 2021; v1 submitted 16 July, 2017;
originally announced July 2017.
-
On Approximating the Number of $k$-cliques in Sublinear Time
Authors:
Talya Eden,
Dana Ron,
C. Seshadhri
Abstract:
We study the problem of approximating the number of $k$-cliques in a graph when given query access to the graph.
We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries and (3) pair queries. Let $n$ denote the number of vertices in the graph, $m$ the number of edges, and $C_k$ the number of $k$-cliques. We design an algorithm that outputs a $(1+\varepsilon)$-approximation (with high probability) for $C_k$, whose expected query complexity and running time are $O\left(\frac{n}{C_k^{1/k}}+\frac{m^{k/2}}{C_k}\right)\cdot\mathrm{poly}(\log n,1/\varepsilon,k)$.
Hence, the complexity of the algorithm is sublinear in the size of the graph for $C_k = ω(m^{k/2-1})$. Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on $\log n$, $1/\varepsilon$ and $k$).
The previous results in this vein are by Feige (SICOMP 06) and by Goldreich and Ron (RSA 08) for edge counting ($k=2$) and by Eden et al. (FOCS 2015) for triangle counting ($k=3$). Our result matches the complexities of these results.
The previous result by Eden et al. hinges on a certain amortization technique that works only for triangle counting, and does not generalize for larger cliques. We obtain a general algorithm that works for any $k\geq 3$ by designing a procedure that samples each $k$-clique incident to a given set $S$ of vertices with approximately equal probability. The primary difficulty is in finding cliques incident to purely high-degree vertices, since random sampling within neighbors has a low success probability. This is achieved by an algorithm that samples uniform random high degree vertices and a careful tradeoff between estimating cliques incident purely to high-degree vertices and those that include a low-degree vertex.
Submitted 12 March, 2018; v1 submitted 16 July, 2017;
originally announced July 2017.
-
Tolerant Junta Testing and the Connection to Submodular Optimization and Function Isomorphism
Authors:
Eric Blais,
Clément L. Canonne,
Talya Eden,
Amit Levi,
Dana Ron
Abstract:
A function $f\colon \{-1,1\}^n \to \{-1,1\}$ is a $k$-junta if it depends on at most $k$ of its variables. We consider the problem of tolerant testing of $k$-juntas, where the testing algorithm must accept any function that is $ε$-close to some $k$-junta and reject any function that is $ε'$-far from every $k'$-junta for some $ε'= O(ε)$ and $k' = O(k)$.
Our first result is an algorithm that solves this problem with query complexity polynomial in $k$ and $1/ε$. This result is obtained via a new polynomial-time approximation algorithm for submodular function minimization (SFM) under large cardinality constraints, which holds even when only given an approximate oracle access to the function.
Our second result considers the case where $k'=k$. We show how to obtain a smooth tradeoff between the amount of tolerance and the query complexity in this setting. Specifically, we design an algorithm that given $ρ\in(0,1/2)$ accepts any function that is $\frac{ερ}{16}$-close to some $k$-junta and rejects any function that is $ε$-far from every $k$-junta. The query complexity of the algorithm is $O\big( \frac{k\log k}{ερ(1-ρ)^k} \big)$.
Finally, we show how to apply the second result to the problem of tolerant isomorphism testing between two unknown Boolean functions $f$ and $g$. We give an algorithm for this problem whose query complexity only depends on the (unknown) smallest $k$ such that either $f$ or $g$ is close to being a $k$-junta.
Submitted 3 November, 2016; v1 submitted 13 July, 2016;
originally announced July 2016.
-
A Local Algorithm for Constructing Spanners in Minor-Free Graphs
Authors:
Reut Levi,
Dana Ron,
Ronitt Rubinfeld
Abstract:
Constructing a spanning tree of a graph is one of the most basic tasks in graph theory. We consider this problem in the setting of local algorithms: one wants to quickly determine whether a given edge $e$ is in a specific spanning tree, without computing the whole spanning tree, but rather by inspecting the local neighborhood of $e$. The challenge is to maintain consistency. That is, to answer queries about different edges according to the same spanning tree. Since it is known that this problem cannot be solved without essentially viewing all the graph, we consider the relaxed version of finding a spanning subgraph with $(1+ε)n$ edges (where $n$ is the number of vertices and $ε$ is a given sparsity parameter). It is known that this relaxed problem requires inspecting $Ω(\sqrt{n})$ edges in general graphs, which motivates the study of natural restricted families of graphs. One such family is the family of graphs with an excluded minor. For this family there is an algorithm that achieves constant success probability, and inspects $(d/ε)^{poly(h)\log(1/ε)}$ edges (for each edge it is queried on), where $d$ is the maximum degree in the graph and $h$ is the size of the excluded minor. The distances between pairs of vertices in the spanning subgraph $G'$ are at most a factor of $poly(d, 1/ε, h)$ larger than in $G$.
In this work, we show that for an input graph that is $H$-minor free for any $H$ of size $h$, this task can be performed by inspecting only $poly(d, 1/ε, h)$ edges. The distances between pairs of vertices in the spanning subgraph $G'$ are at most a factor of $\tilde{O}(h\log(d)/ε)$ larger than in $G$. Furthermore, the error probability of the new algorithm is significantly improved to $Θ(1/n)$. This algorithm can also be easily adapted to yield an efficient algorithm for the distributed setting.
Submitted 24 April, 2016;
originally announced April 2016.
-
Sublinear Time Estimation of Degree Distribution Moments: The Degeneracy Connection
Authors:
Talya Eden,
Dana Ron,
C. Seshadhri
Abstract:
We revisit the classic problem of estimating the degree distribution moments of an undirected graph. Consider an undirected graph $G=(V,E)$ with $n$ vertices, and define (for $s > 0$) $μ_s = \frac{1}{n}\cdot\sum_{v \in V} d^s_v$. Our aim is to estimate $μ_s$ within a multiplicative error of $(1+ε)$ (for a given approximation parameter $ε>0$) in sublinear time. We consider the sparse graph model that allows access to: uniform random vertices, queries for the degree of any vertex, and queries for a neighbor of any vertex. For the case of $s=1$ (the average degree), $\widetilde{O}(\sqrt{n})$ queries suffice for any constant $ε$ (Feige, SICOMP 06 and Goldreich-Ron, RSA 08). Gonen-Ron-Shavitt (SIDMA 11) extended this result to all integral $s > 0$, by designing an algorithm that performs $\widetilde{O}(n^{1-1/(s+1)})$ queries.
We design a new, significantly simpler algorithm for this problem. In the worst-case, it exactly matches the bounds of Gonen-Ron-Shavitt, and has a much simpler proof. More importantly, the running time of this algorithm is connected to the degeneracy of $G$. This is (essentially) the maximum density of an induced subgraph. For the family of graphs with degeneracy at most $α$, it has a query complexity of $\widetilde{O}\left(\frac{n^{1-1/s}}{μ^{1/s}_s} \Big(α^{1/s} + \min\{α,μ^{1/s}_s\}\Big)\right) = \widetilde{O}(n^{1-1/s}α/μ^{1/s}_s)$. Thus, for the class of bounded degeneracy graphs (which includes all minor closed families and preferential attachment graphs), we can estimate the average degree in $\widetilde{O}(1)$ queries, and can estimate the variance of the degree distribution in $\widetilde{O}(\sqrt{n})$ queries. This is a major improvement over the previous worst-case bounds. Our key insight is in designing an estimator for $μ_s$ that has low variance when $G$ does not have large dense subgraphs.
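To make the quantity being estimated concrete, the following minimal Python sketch computes $μ_s$ exactly from a full list of degrees. It is only a baseline under stated assumptions: it reads all $n$ degrees, so it is not sublinear and is not the estimator described in the abstract. The function name `degree_moment` and the example graph are hypothetical, chosen for illustration.

```python
from typing import Sequence


def degree_moment(degrees: Sequence[int], s: float) -> float:
    """Exact mu_s = (1/n) * sum over vertices of d_v^s.

    Reads every degree, so this is a linear-time reference value,
    not the sublinear estimator from the abstract.
    """
    if not degrees:
        raise ValueError("need at least one vertex")
    return sum(d ** s for d in degrees) / len(degrees)


# Hypothetical example: a star on 5 vertices (center of degree 4, four leaves).
star = [4, 1, 1, 1, 1]
mu1 = degree_moment(star, 1)  # average degree: (4 + 1 + 1 + 1 + 1) / 5
mu2 = degree_moment(star, 2)  # second moment: (16 + 1 + 1 + 1 + 1) / 5
```

A sublinear algorithm would approximate the same value while looking at only a small sample of vertices and neighbors; the exact value above is what its output would be compared against.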
Submitted 16 February, 2017; v1 submitted 13 April, 2016;
originally announced April 2016.
-
Approximately Counting Triangles in Sublinear Time
Authors:
Talya Eden,
Amit Levi,
Dana Ron,
C. Seshadhri
Abstract:
We consider the problem of estimating the number of triangles in a graph. This problem has been extensively studied in both theory and practice, but all existing algorithms read the entire graph. In this work we design a {\em sublinear-time\/} algorithm for approximating the number of triangles in a graph, where the algorithm is given query access to the graph. The allowed queries are degree queries, vertex-pair queries and neighbor queries.
We show that for any given approximation parameter $0<ε<1$, the algorithm provides an estimate $\widehat{t}$ such that with high constant probability, $(1-ε)\cdot t< \widehat{t}<(1+ε)\cdot t$, where $t$ is the number of triangles in the graph $G$. The expected query complexity of the algorithm is $\!\left(\frac{n}{t^{1/3}} + \min\left\{m, \frac{m^{3/2}}{t}\right\}\right)\cdot {\rm poly}(\log n, 1/ε)$, where $n$ is the number of vertices in the graph and $m$ is the number of edges, and the expected running time is $\!\left(\frac{n}{t^{1/3}} + \frac{m^{3/2}}{t}\right)\cdot {\rm poly}(\log n, 1/ε)$. We also prove that $Ω\!\left(\frac{n}{t^{1/3}} + \min\left\{m, \frac{m^{3/2}}{t}\right\}\right)$ queries are necessary, thus establishing that the query complexity of this algorithm is optimal up to polylogarithmic factors in $n$ (and the dependence on $1/ε$).
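The query model named in this abstract (degree, neighbor, and vertex-pair queries) can be pictured as a small oracle interface that counts how many queries an algorithm makes. The sketch below is an illustration only: the class name `GraphOracle`, the 0-indexed neighbor convention, and the in-memory adjacency representation are assumptions, and no part of the triangle-counting algorithm itself is shown.

```python
class GraphOracle:
    """Hypothetical query-access wrapper around an in-memory graph.

    Supports the three query types of the standard model and counts
    every query, so an algorithm's query complexity can be measured.
    """

    def __init__(self, adjacency):
        # adjacency: dict mapping each vertex to a list of its neighbors
        self._adj = {v: list(ns) for v, ns in adjacency.items()}
        self._sets = {v: set(ns) for v, ns in adjacency.items()}
        self.queries = 0

    def degree(self, v):
        """Degree query: return deg(v)."""
        self.queries += 1
        return len(self._adj[v])

    def neighbor(self, v, i):
        """Neighbor query: the i-th neighbor of v (0-indexed), or None if out of range."""
        self.queries += 1
        ns = self._adj[v]
        return ns[i] if 0 <= i < len(ns) else None

    def pair(self, u, v):
        """Vertex-pair query: is {u, v} an edge?"""
        self.queries += 1
        return v in self._sets[u]


# Hypothetical example: triangle 0-1-2 with a pendant vertex 3 attached to 0.
oracle = GraphOracle({0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]})
d0 = oracle.degree(0)
u = oracle.neighbor(0, 0)
w = oracle.neighbor(0, 1)
closes_triangle = oracle.pair(u, w)  # one pair query checks whether (0, u, w) is a triangle
```

Counting queries in the wrapper, rather than in the algorithm, keeps the measurement independent of how any particular estimator is written.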
Submitted 22 September, 2015; v1 submitted 3 April, 2015;
originally announced April 2015.
-
Constructing Near Spanning Trees with Few Local Inspections
Authors:
Reut Levi,
Guy Moshkovitz,
Dana Ron,
Ronitt Rubinfeld,
Asaf Shapira
Abstract:
Constructing a spanning tree of a graph is one of the most basic tasks in graph theory. Motivated by several recent studies of local graph algorithms, we consider the following variant of this problem. Let G be a connected bounded-degree graph. Given an edge $e$ in $G$ we would like to decide whether $e$ belongs to a connected subgraph $G'$ consisting of $(1+ε)n$ edges (for a prespecified constant $ε>0$), where the decision for different edges should be consistent with the same subgraph $G'$. Can this task be performed by inspecting only a {\em constant} number of edges in $G$? Our main results are:
(1) We show that if every $t$-vertex subgraph of $G$ has expansion $1/(\log t)^{1+o(1)}$ then one can (deterministically) construct a sparse spanning subgraph $G'$ of $G$ using few inspections. To this end we analyze a "local" version of a famous minimum-weight spanning tree algorithm.
(2) We show that the above expansion requirement is sharp even when allowing randomization. To this end we construct a family of $3$-regular graphs of high girth, in which every $t$-vertex subgraph has expansion $1/(\log t)^{1-o(1)}$.
Submitted 3 February, 2015; v1 submitted 2 February, 2015;
originally announced February 2015.
-
Distributed Maximum Matching in Bounded Degree Graphs
Authors:
Guy Even,
Moti Medina,
Dana Ron
Abstract:
We present deterministic distributed algorithms for computing approximate maximum cardinality matchings and approximate maximum weight matchings. Our algorithm for the unweighted case computes a matching whose size is at least $(1-\eps)$ times the optimal in $Δ^{O(1/\eps)} + O\left(\frac{1}{\eps^2}\right) \cdot\log^*(n)$ rounds where $n$ is the number of vertices in the graph and $Δ$ is the maximum degree. Our algorithm for the edge-weighted case computes a matching whose weight is at least $(1-\eps)$ times the optimal in $\log(\min\{1/\wmin,n/\eps\})^{O(1/\eps)}\cdot(Δ^{O(1/\eps)}+\log^*(n))$ rounds for edge-weights in $[\wmin,1]$.
The best previous algorithms for both the unweighted case and the weighted case are by Lotker, Patt-Shamir, and Pettie (SPAA 2008). For the unweighted case they give a randomized $(1-\eps)$-approximation algorithm that runs in $O((\log(n)) /\eps^3)$ rounds. For the weighted case they give a randomized $(1/2-\eps)$-approximation algorithm that runs in $O(\log(\eps^{-1}) \cdot \log(n))$ rounds. Hence, our results improve on the previous ones when the parameters $Δ$, $\eps$ and $\wmin$ are constants (where we reduce the number of rounds from $O(\log(n))$ to $O(\log^*(n))$), and more generally when $Δ$, $1/\eps$ and $1/\wmin$ are sufficiently slowly increasing functions of $n$. Moreover, our algorithms are deterministic rather than randomized.
Submitted 11 November, 2014; v1 submitted 29 July, 2014;
originally announced July 2014.
-
The Power of an Example: Hidden Set Size Approximation Using Group Queries and Conditional Sampling
Authors:
Dana Ron,
Gilad Tsur
Abstract:
We study a basic problem of approximating the size of an unknown set $S$ in a known universe $U$. We consider two versions of the problem. In both versions the algorithm can specify subsets $T\subseteq U$. In the first version, which we refer to as the group query or subset query version, the algorithm is told whether $T\cap S$ is non-empty. In the second version, which we refer to as the subset sampling version, if $T\cap S$ is non-empty, then the algorithm receives a uniformly selected element from $T\cap S$. We study the difference between these two versions under different conditions on the subsets that the algorithm may query/sample, and in both the case that the algorithm is adaptive and the case where it is non-adaptive. In particular we focus on a natural family of allowed subsets, which correspond to intervals, as well as variants of this family.
Submitted 20 April, 2014;
originally announced April 2014.
-
Best of Two Local Models: Local Centralized and Local Distributed Algorithms
Authors:
Guy Even,
Moti Medina,
Dana Ron
Abstract:
We consider two models of computation: centralized local algorithms and local distributed algorithms. Algorithms in one model are adapted to the other model to obtain improved algorithms.
Distributed vertex coloring is employed to design improved centralized local algorithms for: maximal independent set, maximal matching, and an approximation scheme for maximum (weighted) matching over bounded degree graphs. The improvement is threefold: the algorithms are deterministic, stateless, and the number of probes grows polynomially in $\log^* n$, where $n$ is the number of vertices of the input graph.
The recursive centralized local improvement technique by Nguyen and Onak~\cite{onak2008} is employed to obtain an improved distributed approximation scheme for maximum (weighted) matching. The improvement is twofold: we reduce the number of rounds from $O(\log n)$ to $O(\log^*n)$ for a wide range of instances, and our algorithms are deterministic rather than randomized.
Submitted 11 November, 2014; v1 submitted 16 February, 2014;
originally announced February 2014.
-
Local Algorithms for Sparse Spanning Graphs
Authors:
Reut Levi,
Dana Ron,
Ronitt Rubinfeld
Abstract:
Constructing a spanning tree of a graph is one of the most basic tasks in graph theory. We consider a relaxed version of this problem in the setting of local algorithms. The relaxation is that the constructed subgraph is a sparse spanning subgraph containing at most $(1+ε)n$ edges (where $n$ is the number of vertices and $ε$ is a given approximation/sparsity parameter). In the local setting, the goal is to quickly determine whether a given edge $e$ belongs to such a subgraph, without constructing the whole subgraph, but rather by inspecting (querying) the local neighborhood of $e$. The challenge is to maintain consistency. That is, to provide answers concerning different edges according to the same spanning subgraph.
We first show that for general bounded-degree graphs, the query complexity of any such algorithm must be $Ω(\sqrt{n})$. This lower bound holds for constant-degree graphs that have high expansion. Next we design an algorithm for (bounded-degree) graphs with high expansion, obtaining a result that roughly matches the lower bound. We then turn to study graphs that exclude a fixed minor (and are hence non-expanding). We design an algorithm for such graphs, which may have an unbounded maximum degree. The query complexity of this algorithm is $poly(1/ε, h)$ (independent of $n$ and the maximum degree), where $h$ is the number of vertices in the excluded minor.
Though our two algorithms are designed for very different types of graphs (and have very different complexities), on a high-level there are several similarities, and we highlight both the similarities and the differences.
Submitted 27 April, 2021; v1 submitted 14 February, 2014;
originally announced February 2014.
-
A Quasi-Polynomial Time Partition Oracle for Graphs with an Excluded Minor
Authors:
Reut Levi,
Dana Ron
Abstract:
Motivated by the problem of testing planarity and related properties, we study the problem of designing efficient {\em partition oracles}. A {\em partition oracle} is a procedure that, given access to the incidence lists representation of a bounded-degree graph $G= (V,E)$ and a parameter $\eps$, when queried on a vertex $v\in V$, returns the part (subset of vertices) which $v$ belongs to in a partition of all graph vertices. The partition should be such that all parts are small, each part is connected, and if the graph has certain properties, the total number of edges between parts is at most $\eps |V|$. In this work we give a partition oracle for graphs with excluded minors whose query complexity is quasi-polynomial in $1/\eps$, thus improving on the result of Hassidim et al. ({\em Proceedings of FOCS 2009}) who gave a partition oracle with query complexity exponential in $1/\eps$. This improvement implies corresponding improvements in the complexity of testing planarity and other properties that are characterized by excluded minors as well as sublinear-time approximation algorithms that work under the promise that the graph has an excluded minor.
Submitted 14 February, 2013;
originally announced February 2013.
-
A simple online competitive adaptation of Lempel-Ziv compression with efficient random access support
Authors:
Akashnil Dutta,
Reut Levi,
Dana Ron,
Ronitt Rubinfeld
Abstract:
We present a simple adaptation of the Lempel-Ziv '78 (LZ78) compression scheme ({\em IEEE Transactions on Information Theory, 1978}) that supports efficient random access to the input string. Namely, given query access to the compressed string, it is possible to efficiently recover any symbol of the input string. The compression algorithm is given as input a parameter $\eps >0$, and with very high probability increases the length of the compressed string by at most a factor of $(1+\eps)$. The access time is $O(\log n + 1/\eps^2)$ in expectation, and $O(\log n/\eps^2)$ with high probability. The scheme relies on sparse transitive-closure spanners. Any (consecutive) substring of the input string can be retrieved at an additional additive cost in the running time of the length of the substring. We also formally establish the necessity of modifying LZ78 so as to allow efficient random access. Specifically, we construct a family of strings for which $Ω(n/\log n)$ queries to the LZ78-compressed string are required in order to recover a single symbol in the input string. The main benefit of the proposed scheme is that it preserves the online nature and simplicity of LZ78, and that for {\em every} input string, the length of the compressed string is only a small factor larger than that obtained by running LZ78.
Submitted 11 January, 2013;
originally announced January 2013.
-
Testing probability distributions using conditional samples
Authors:
Clement Canonne,
Dana Ron,
Rocco A. Servedio
Abstract:
We study a new framework for property testing of probability distributions, by considering distribution testing algorithms that have access to a conditional sampling oracle.* This is an oracle that takes as input a subset $S \subseteq [N]$ of the domain $[N]$ of the unknown probability distribution $D$ and returns a draw from the conditional probability distribution $D$ restricted to $S$. This new model allows considerable flexibility in the design of distribution testing algorithms; in particular, testing algorithms in this model can be adaptive.
We study a wide range of natural distribution testing problems in this new framework and some of its variants, giving both upper and lower bounds on query complexity. These problems include testing whether $D$ is the uniform distribution $\mathcal{U}$; testing whether $D = D^\ast$ for an explicitly provided $D^\ast$; testing whether two unknown distributions $D_1$ and $D_2$ are equivalent; and estimating the variation distance between $D$ and the uniform distribution. At a high level our main finding is that the new "conditional sampling" framework we consider is a powerful one: while all the problems mentioned above have $Ω(\sqrt{N})$ sample complexity in the standard model (and in some cases the complexity must be almost linear in $N$), we give $\mathrm{poly}(\log N, 1/\varepsilon)$-query algorithms (and in some cases $\mathrm{poly}(1/\varepsilon)$-query algorithms independent of $N$) for all these problems in our conditional sampling setting.
*Independently from our work, Chakraborty et al. also considered this framework. We discuss their work in Subsection [1.4].
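The conditional sampling oracle described above can be illustrated with a short Python sketch: given a subset $S$ of the domain, it returns a draw from $D$ restricted to $S$. This is a toy model under stated assumptions, not the paper's formal definition: the distribution is given explicitly as a dictionary (a real tester would not see it), the name `cond_sample` is hypothetical, and raising an error when $D(S) = 0$ is a choice made here for simplicity.

```python
import random


def cond_sample(weights, subset, rng=random):
    """Draw one element from D restricted to `subset`.

    weights: dict mapping each domain element to its probability under D.
    subset:  iterable of domain elements (the query set S).
    Assumption: if D(S) = 0 the conditional distribution is undefined,
    and this sketch raises ValueError in that case.
    """
    items = [x for x in subset if weights.get(x, 0) > 0]
    if not items:
        raise ValueError("D(S) = 0: conditional distribution is undefined")
    # random.choices normalizes the weights, which is exactly conditioning on S.
    return rng.choices(items, weights=[weights[x] for x in items], k=1)[0]


# Hypothetical example: D puts no mass on 1 and splits the rest between 2 and 3.
D = {1: 0.0, 2: 0.5, 3: 0.5}
rng = random.Random(0)
draw = cond_sample(D, {1, 2}, rng)  # only 2 has positive mass inside S = {1, 2}
```

Because the query set can depend on earlier draws, an algorithm built on such an oracle is naturally adaptive, which is the flexibility the abstract points to.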
Submitted 16 January, 2015; v1 submitted 12 November, 2012;
originally announced November 2012.
-
A Near-Optimal Sublinear-Time Algorithm for Approximating the Minimum Vertex Cover Size
Authors:
Krzysztof Onak,
Dana Ron,
Michal Rosen,
Ronitt Rubinfeld
Abstract:
We give a nearly optimal sublinear-time algorithm for approximating the size of a minimum vertex cover in a graph G. The algorithm may query the degree deg(v) of any vertex v of its choice, and for each 1 <= i <= deg(v), it may ask for the i-th neighbor of v. Letting VC_opt(G) denote the minimum size of vertex cover in G, the algorithm outputs, with high constant success probability, an estimate VC_estimate(G) such that VC_opt(G) <= VC_estimate(G) <= 2 * VC_opt(G) + epsilon*n, where epsilon is a given additive approximation parameter. We refer to such an estimate as a (2,epsilon)-estimate. The query complexity and running time of the algorithm are ~O(avg_deg * poly(1/epsilon)), where avg_deg denotes the average vertex degree in the graph. The best previously known sublinear algorithm, of Yoshida et al. (STOC 2009), has query complexity and running time O(d^4/epsilon^2), where d is the maximum degree in the graph. Given the lower bound of Omega(avg_deg) (for constant epsilon) for obtaining such an estimate (with any constant multiplicative factor) due to Parnas and Ron (TCS 2007), our result is nearly optimal.
In the case that the graph is dense, that is, the number of edges is Theta(n^2), we consider another model, in which the algorithm may ask, for any pair of vertices u and v, whether there is an edge between u and v. We show how to adapt the algorithm that uses neighbor queries to this model and obtain an algorithm that outputs a (2,epsilon)-estimate of the size of a minimum vertex cover whose query complexity and running time are ~O(n) * poly(1/epsilon).
Submitted 5 October, 2011;
originally announced October 2011.
-
Approximating the Influence of a monotone Boolean function in O(\sqrt{n}) query complexity
Authors:
Dana Ron,
Ronitt Rubinfeld,
Muli Safra,
Omri Weinstein
Abstract:
The {\em Total Influence} ({\em Average Sensitivity}) of a discrete function is one of its fundamental measures. We study the problem of approximating the total influence of a monotone Boolean function $f: \{\pm1\}^n \to \{\pm1\}$, which we denote by $I[f]$. We present a randomized algorithm that approximates the influence of such functions to within a multiplicative factor of $(1\pm ε)$ by performing $O\left(\frac{\sqrt{n}\log n}{I[f]} \cdot \mathrm{poly}(1/ε)\right)$ queries. We also prove a lower bound of $Ω\left(\frac{\sqrt{n}}{\log n \cdot I[f]}\right)$ on the query complexity of any constant-factor approximation algorithm for this problem (which holds for $I[f] = Ω(1)$), hence showing that our algorithm is almost optimal in terms of its dependence on $n$. For general functions we give a lower bound of $Ω(\frac{n}{I[f]})$, which matches the complexity of a simple sampling algorithm.
Submitted 27 January, 2011;
originally announced January 2011.
-
Finding Cycles and Trees in Sublinear Time
Authors:
Artur Czumaj,
Oded Goldreich,
Dana Ron,
C. Seshadhri,
Asaf Shapira,
Christian Sohler
Abstract:
We present sublinear-time (randomized) algorithms for finding simple cycles of length at least $k\geq 3$ and tree-minors in bounded-degree graphs. The complexity of these algorithms is related to the distance of the graph from being $C_k$-minor-free (resp., free from having the corresponding tree-minor). In particular, if the graph is far (i.e., $\Omega(1)$-far) from being cycle-free, i.e., if one has to delete a constant fraction of edges to make it cycle-free, then the algorithm finds a cycle of polylogarithmic length in time $\tilde{O}(\sqrt{N})$, where $N$ denotes the number of vertices. This time complexity is optimal up to polylogarithmic factors.
The foregoing results are the outcome of our study of the complexity of {\em one-sided error} property testing algorithms in the bounded-degree graphs model. For example, we show that cycle-freeness of $N$-vertex graphs can be tested with one-sided error within time complexity $\tilde{O}(\mathrm{poly}(1/\epsilon)\cdot\sqrt{N})$. This matches the known $\Omega(\sqrt{N})$ query lower bound, and contrasts with the fact that any minor-free property admits a {\em two-sided error} tester of query complexity that only depends on the proximity parameter $\epsilon$. For any constant $k\geq3$, we extend this result to testing whether the input graph has a simple cycle of length at least $k$. On the other hand, for any fixed tree $T$, we show that $T$-minor-freeness has a one-sided error tester of query complexity that only depends on the proximity parameter $\epsilon$.
Our algorithm for finding cycles in bounded-degree graphs extends to general graphs, where distances are measured with respect to the actual number of edges. Such an extension is not possible with respect to finding tree-minors in $o(\sqrt{N})$ complexity.
Submitted 3 April, 2012; v1 submitted 23 July, 2010;
originally announced July 2010.
-
Relaxation-based coarsening and multiscale graph organization
Authors:
Dorit Ron,
Ilya Safro,
Achi Brandt
Abstract:
In this paper we generalize and improve the multiscale organization of graphs by introducing a new measure that quantifies the "closeness" between two nodes. The calculation of the measure is linear in the number of edges in the graph and involves just a small number of relaxation sweeps. A similar notion of distance is then calculated and used at each coarser level. We demonstrate the use of this measure in multiscale methods for several important combinatorial optimization problems and discuss the multiscale graph organization.
Submitted 7 April, 2010;
originally announced April 2010.
-
A Fast Multigrid Algorithm for Energy Minimization Under Planar Density Constraints
Authors:
Dorit Ron,
Ilya Safro,
Achi Brandt
Abstract:
The two-dimensional layout optimization problem reinforced by the efficient space utilization demand has a wide spectrum of practical applications. Formulating the problem as a nonlinear minimization problem under planar equality and/or inequality density constraints, we present a linear time multigrid algorithm for solving correction to this problem. The method is demonstrated on various graph drawing (visualization) instances.
Submitted 18 February, 2009;
originally announced February 2009.
-
Sublinear Algorithms for Approximating String Compressibility
Authors:
Sofya Raskhodnikova,
Dana Ron,
Ronitt Rubinfeld,
Adam Smith
Abstract:
We raise the question of approximating the compressibility of a string with respect to a fixed compression scheme, in sublinear time. We study this question in detail for two popular lossless compression schemes: run-length encoding (RLE) and Lempel-Ziv (LZ), and present sublinear algorithms for approximating compressibility with respect to both schemes. We also give several lower bounds that show that our algorithms for both schemes cannot be improved significantly.
Our investigation of LZ yields results whose interest goes beyond the initial questions we set out to study. In particular, we prove combinatorial structural lemmas that relate the compressibility of a string with respect to Lempel-Ziv to the number of distinct short substrings contained in it. In addition, we show that approximating the compressibility with respect to LZ is related to approximating the support size of a distribution.
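To make the RLE setting concrete, the following is a minimal sketch under stated assumptions, not the algorithm from the paper. It treats the number of runs as a proxy for the RLE cost (ignoring the bits needed to write each run length) and estimates that number by sampling adjacent positions, which gives an additive-error estimate. The function names `rle_run_count` and `estimate_run_count` are hypothetical.

```python
import random


def rle_run_count(s):
    """Exact number of maximal runs of equal symbols in s (reads all of s)."""
    if not s:
        return 0
    return 1 + sum(1 for i in range(len(s) - 1) if s[i] != s[i + 1])


def estimate_run_count(s, num_samples, rng=None):
    """Estimate the number of runs from num_samples random adjacent pairs.

    Assumption: the run count stands in for RLE cost. Each sample reads
    two symbols, so the estimator reads at most 2 * num_samples symbols.
    """
    rng = rng or random.Random()
    n = len(s)
    if n < 2:
        return float(n)
    boundaries = 0
    for _ in range(num_samples):
        # A position i is a run boundary when s[i] and s[i + 1] differ.
        i = rng.randrange(n - 1)
        if s[i] != s[i + 1]:
            boundaries += 1
    # runs = 1 + (number of boundaries); scale the sampled boundary rate.
    return 1 + (n - 1) * boundaries / num_samples
```

On a constant string the estimate is exactly 1.0, and on a strictly alternating string of length $n$ it is exactly $n$. For strings in between, the error shrinks as `num_samples` grows.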
Submitted 7 June, 2007;
originally announced June 2007.