Shortest Path Centrality and the APSP problem via VC-dimension and Rademacher Averages
Authors:
Alane M. de Lima,
Murilo V. G. da Silva,
André L. Vignatti
Abstract:
In this paper we are interested in a version of the All-pairs Shortest Paths problem (APSP) that fits neither in the exact nor in the approximate case. We define a measure of centrality of a shortest path, related to the ``importance'' of that shortest path in the graph, and propose an algorithm based on the idea of progressive sampling that, for {\it any fixed constants} $0 < ε, δ < 1$, given an undirected graph $G$ with non-negative edge weights, outputs with probability $1 - δ$ a data structure of size $n \cdot \textrm{Diam}_V(G)$, where $\textrm{Diam}_V(G)$ is the vertex diameter of $G$, in expected time $\mathcal{O}(\lg n \max(m + n \log n, n \cdot \textrm{Diam}_V(G)))$ containing the (exact) distance and the shortest path between every pair of vertices $(u,v)$ that has centrality at least $ε$. The progressive sampling technique is sensitive to the probability distribution of the input (if we assume that $G$ is chosen from a prescribed random distribution), but even when we make no assumption about this distribution, we show an upper bound for the sample size using VC-dimension theory that is tighter than the bound given by standard Hoeffding and union bounds, since VC-dimension theory captures the combinatorial structure of the input graph.
Submitted 4 May, 2020; v1 submitted 29 November, 2019;
originally announced November 2019.
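The sample-size comparison claimed in the abstract above can be illustrated numerically. The following is a minimal sketch, not the authors' derivation: it compares the textbook Hoeffding-plus-union-bound sample size with the generic VC-dimension $ε$-approximation bound $r = (c/ε^2)(d + \ln(1/δ))$. The function names, the constant `c = 0.5`, and the VC-dimension value `d = 5` are assumptions chosen for illustration; the abstract does not state the paper's actual bound.

```python
import math


def hoeffding_union_sample_size(eps, delta, num_events):
    """Samples needed so that num_events empirical means of [0, 1]-valued
    variables are all within eps of their expectations with probability
    at least 1 - delta (two-sided Hoeffding inequality + union bound)."""
    return math.ceil(math.log(2 * num_events / delta) / (2 * eps ** 2))


def vc_sample_size(eps, delta, vc_dim, c=0.5):
    """Generic eps-approximation sample size (c / eps^2) * (d + ln(1/delta)).
    c is a universal constant; 0.5 is an assumed value for illustration."""
    return math.ceil(c / eps ** 2 * (vc_dim + math.log(1 / delta)))


if __name__ == "__main__":
    n = 10 ** 6                      # hypothetical number of vertices
    pairs = n * (n - 1) // 2         # one event per unordered vertex pair
    eps, delta = 0.05, 0.1
    d = 5                            # hypothetical VC-dimension upper bound
    print("Hoeffding + union bound:", hoeffding_union_sample_size(eps, delta, pairs))
    print("VC-dimension bound:     ", vc_sample_size(eps, delta, d))
```

The union bound grows with the logarithm of the number of vertex pairs, while the VC-dimension bound depends only on $d$, which is why the latter can be smaller on large graphs whose combinatorial structure keeps $d$ low.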
Estimating the Percolation Centrality of Large Networks through Pseudo-dimension Theory
Authors:
Alane M. de Lima,
Murilo V. G. da Silva,
André L. Vignatti
Abstract:
In this work we investigate the problem of estimating the percolation centrality of every vertex in a graph. This centrality measure quantifies the importance of each vertex in a graph going through a contagious process. It is an open problem whether the percolation centrality can be computed in $\mathcal{O}(n^{3-c})$ time, for any constant $c>0$. In this paper we present an $\mathcal{O}(m \log^2 n)$ randomized approximation algorithm for the percolation centrality of every vertex of $G$, generalizing techniques developed by Riondato, Upfal, and Kornaropoulos (this complexity is reduced to $\mathcal{O}((m+n) \log n)$ for unweighted graphs). The estimate obtained by the algorithm is within $ε$ of the exact value with probability $1-δ$, for {\it fixed} constants $0 < ε, δ \leq 1$. In fact, we show in our experimental analysis that, for real-world complex networks, the output produced by our algorithm is significantly closer to the exact values than its theoretical worst-case guarantee.
Submitted 15 February, 2020; v1 submitted 1 October, 2019;
originally announced October 2019.