Exact Distributed Stochastic Block Partitioning

Wanye, Frank; Gleyzer, Vitaliy; Kao, Edward; Feng, Wu-chun

Abstract: Stochastic block partitioning (SBP) is a community detection algorithm that is highly accurate even on graphs with a complex community structure, but its inherently serial nature hinders its adoption by the wider scientific community. To make it practical to analyze large real-world graphs with SBP, there is a growing need to parallelize and distribute the algorithm. The current state-of-the-art distributed SBP algorithm is a divide-and-conquer approach that limits communication between compute nodes until the end of inference. This breaks computational dependencies, which causes convergence issues as the number of compute nodes increases and when the graph is sufficiently sparse. In this paper, we introduce EDiSt, an exact distributed stochastic block partitioning algorithm. Under EDiSt, compute nodes periodically share community assignments during inference. Thanks to this additional communication, EDiSt can scale out to a larger number of compute nodes than the divide-and-conquer algorithm without suffering from convergence issues, even on sparse graphs. We show that EDiSt provides speedups of up to 23.8X over the divide-and-conquer approach, and speedups of up to 38.0X over shared-memory parallel SBP when scaled out to 64 compute nodes.
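
The communication pattern described in the abstract, in which compute nodes periodically share community assignments during inference, can be illustrated with a toy single-process simulation. This is only a sketch under stated assumptions and not the EDiSt implementation: the names `local_update` and `run_with_periodic_sharing` are hypothetical, the round-robin vertex ownership is an assumption, and the update rule (move each vertex to its neighbors' most common community) is a simple stand-in for SBP's actual inference step. The point is only that every simulated node applies every other node's updates each round, so all copies of the assignment stay consistent.

```python
from collections import Counter


def local_update(assign, adj, owned):
    """Stand-in for one inference step on the vertices a node owns.

    Moves each owned vertex to the most common community among its
    neighbors. This is NOT the SBP update; it is a placeholder rule.
    """
    new = {}
    for v in owned:
        counts = Counter(assign[u] for u in adj[v])
        if counts:
            # Ties are broken toward the smallest community label.
            new[v] = max(sorted(counts), key=lambda c: counts[c])
        else:
            new[v] = assign[v]
    return new


def run_with_periodic_sharing(adj, num_workers, rounds):
    """Simulate workers that exchange community assignments every round."""
    n = len(adj)
    # Round-robin vertex ownership (an assumption for illustration).
    owned = [list(range(w, n, num_workers)) for w in range(num_workers)]
    # Every worker holds a full copy; each vertex starts in its own community.
    copies = [list(range(n)) for _ in range(num_workers)]
    for _ in range(rounds):
        # Each worker updates only its own vertices, reading its own copy.
        updates = [local_update(copies[w], adj, owned[w])
                   for w in range(num_workers)]
        # Simulated all-gather: every worker applies every worker's updates.
        for copy in copies:
            for upd in updates:
                for v, c in upd.items():
                    copy[v] = c
    return copies


# Two triangles joined by a single edge (2-3).
adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
copies = run_with_periodic_sharing(adj, num_workers=2, rounds=5)
```

In a divide-and-conquer scheme, by contrast, the exchange step inside the loop would be skipped and the per-worker results would only be combined once at the end, which is where the broken dependencies mentioned in the abstract arise.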

Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:2305.18663 [cs.DC]
	(or arXiv:2305.18663v1 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.2305.18663
