A Reachability Index for Recursive Label-Concatenated Graph Queries
Authors:
Chao Zhang,
Angela Bonifati,
Hugo Kapp,
Vlad Ioan Haprian,
Jean-Pierre Lozi
Abstract:
Reachability queries checking the existence of a path from a source node to a target node are fundamental operators for querying and processing graph data. Current approaches for index-based evaluation of reachability queries either focus on plain reachability or constraint-based reachability with alternation only. In this paper, for the first time we study the problem of index-based processing for recursive label-concatenated reachability queries, referred to as RLC queries. These queries check the existence of a path that can satisfy the constraint defined by a concatenation of at most k edge labels under the Kleene plus. Many practical graph database and network analysis applications exhibit RLC queries. However, their evaluation remains prohibitive in current graph database engines.
We introduce the RLC index, the first reachability index to efficiently process RLC queries. The RLC index checks whether the source vertex can reach an intermediate vertex that can also reach the target vertex under a recursive label-concatenated constraint. We propose an indexing algorithm to build the RLC index, which guarantees the soundness and the completeness of query execution and avoids recording redundant index entries. Comprehensive experiments on real-world graphs show that the RLC index can significantly reduce both the offline processing cost and the memory overhead of transitive closure while improving query processing up to six orders of magnitude over online traversals. Finally, our open-source implementation of the RLC index significantly outperforms current mainstream graph engines for evaluating RLC queries.
Submitted 20 July, 2022; v1 submitted 16 March, 2022;
originally announced March 2022.
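To make the query semantics in the abstract above concrete, here is a minimal sketch of how an RLC query could be answered by a plain online traversal, the baseline the abstract compares against. It is not the RLC index and not the authors' implementation. The function name `rlc_reachable`, the edge-triple representation, and the example labels `knows` and `worksFor` are all assumptions made for illustration.

```python
from collections import deque


def rlc_reachable(edges, source, target, labels):
    """Return True if some path source -> target spells (labels)+.

    edges  : iterable of (u, label, v) triples (hypothetical representation)
    labels : the concatenated label sequence l1..lk placed under the Kleene plus
    """
    if not labels:
        raise ValueError("labels must contain at least one edge label")

    # Adjacency keyed by (node, label) so each step only follows matching edges.
    adj = {}
    for u, label, v in edges:
        adj.setdefault((u, label), []).append(v)

    k = len(labels)
    # A search state is (node, position in the label sequence).
    start = (source, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        node, pos = queue.popleft()
        for nxt in adj.get((node, labels[pos]), []):
            npos = (pos + 1) % k
            # Accept only after a whole repetition of l1..lk has been consumed,
            # which also enforces "at least one repetition" (plus, not star).
            if nxt == target and npos == 0:
                return True
            state = (nxt, npos)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False


# Hypothetical example graph: a -knows-> b -worksFor-> c -knows-> d -worksFor-> e
example_edges = [
    ("a", "knows", "b"),
    ("b", "worksFor", "c"),
    ("c", "knows", "d"),
    ("d", "worksFor", "e"),
]
```

With this graph, `rlc_reachable(example_edges, "a", "e", ["knows", "worksFor"])` holds because the path repeats the two-label sequence twice, while the same query with target `"d"` does not, since the path to `d` stops partway through a repetition. Every query of this kind re-explores the graph, which is the cost an index is meant to avoid.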
The LDBC Social Network Benchmark
Authors:
Renzo Angles,
János Benjamin Antal,
Alex Averbuch,
Altan Birler,
Peter Boncz,
Márton Búr,
Orri Erling,
Andrey Gubichev,
Vlad Haprian,
Moritz Kaufmann,
Josep Lluís Larriba Pey,
Norbert Martínez,
József Marton,
Marcus Paradies,
Minh-Duc Pham,
Arnau Prat-Pérez,
David Püroja,
Mirko Spasić,
Benjamin A. Steer,
Dávid Szakállas,
Gábor Szárnyas,
Jack Waudby,
Mingxi Wu,
Yuchen Zhang
Abstract:
The Linked Data Benchmark Council's Social Network Benchmark (LDBC SNB) is an effort intended to test various functionalities of systems used for graph-like data management. For this, LDBC SNB uses the recognizable scenario of operating a social network, characterized by its graph-shaped data. LDBC SNB consists of two workloads that focus on different functionalities: the Interactive workload (interactive transactional queries) and the Business Intelligence workload (analytical queries). This document contains the definition of both workloads. This includes a detailed explanation of the data used in the LDBC SNB, a detailed description for all queries, and instructions on how to generate the data and run the benchmark with the provided software.
Submitted 7 September, 2024; v1 submitted 7 January, 2020;
originally announced January 2020.