Reinforcement Learning Enhanced Weighted Sampling for Accurate Subgraph Counting on Fully Dynamic Graph Streams

Wang, Kaixin; Long, Cheng; Yan, Da; Zhang, Jie; Jagadish, H. V.

arXiv:2211.06793 (cs)

[Submitted on 13 Nov 2022]

Abstract: As graph data grows in popularity, a variety of applications need to count the occurrences of subgraph patterns of interest. Many graphs are massive in scale and also fully dynamic (with insertions and deletions of edges), rendering exact computation of these counts infeasible. Common practice is, instead, to estimate the counts from a small sample of edges. Existing sampling algorithms for fully dynamic graphs sample edges with uniform probability. In this paper, we show that we can do much better if we sample edges based on their individual properties. Specifically, we propose a weighted sampling algorithm called WSD for estimating the subgraph count in a fully dynamic graph stream; it samples edges according to weights that indicate their importance and reflect their properties. We determine the edge weights in a data-driven fashion, using a novel method based on reinforcement learning. We conduct extensive experiments to verify that our technique produces estimates with smaller errors while often running faster than existing algorithms.
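To make the idea of sampling edges by weight rather than uniformly more concrete, the following is a minimal sketch of generic weighted reservoir sampling (the Efraimidis-Spirakis key scheme) over an edge stream. It is not the WSD algorithm from the paper: it handles insertions only, ignores deletions, and does not compute a subgraph count estimate. The function name `weighted_reservoir_sample` and the `weight_fn` callback are hypothetical; `weight_fn` stands in for whatever edge weights a learned policy might supply.

```python
import heapq
import random


def weighted_reservoir_sample(edge_stream, weight_fn, k, seed=0):
    """Keep k edges from an insertion-only stream, favoring heavier edges.

    Illustrative sketch only (not WSD): each edge gets a random key
    u ** (1 / w), where u is uniform in [0, 1) and w is the edge weight,
    and the k edges with the largest keys are retained.
    """
    rng = random.Random(seed)
    heap = []  # min-heap of (key, edge); the root holds the smallest key
    for edge in edge_stream:
        w = weight_fn(edge)  # hypothetical weight source, assumed positive
        if w <= 0:
            continue  # edges with non-positive weight are never sampled
        key = rng.random() ** (1.0 / w)
        if len(heap) < k:
            heapq.heappush(heap, (key, edge))
        elif key > heap[0][0]:
            # The new edge beats the weakest retained edge, so swap them.
            heapq.heapreplace(heap, (key, edge))
    return [edge for _, edge in heap]


# Usage: a path graph with 100 edges and an arbitrary, made-up weight rule.
edges = [(i, i + 1) for i in range(100)]
sample = weighted_reservoir_sample(edges, lambda e: 1.0 + (e[0] % 5), k=10)
```

A larger weight pushes an edge's key toward 1, so that edge is more likely to stay in the sample; with all weights equal the scheme reduces to uniform reservoir sampling.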

Comments:	17 pages, 5 figures. Accepted by ICDE'23
Subjects:	Databases (cs.DB); Machine Learning (cs.LG)
Cite as:	arXiv:2211.06793 [cs.DB]
	(or arXiv:2211.06793v1 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.2211.06793

Submission history

From: Kaixin Wang
[v1] Sun, 13 Nov 2022 03:01:34 UTC (5,617 KB)
