Distributed Graph Clustering by Load Balancing

Sun, He; Zanetti, Luca

arXiv:1607.04984v1 (cs)

A newer version of this paper has been withdrawn by He Sun

[Submitted on 18 Jul 2016 (this version), latest version 11 Apr 2019 (v3)]

Abstract: Graph clustering is a fundamental problem with applications in algorithm design, machine learning, data mining, and the analysis of social networks. Over the past decades, researchers have proposed many algorithmic design methods for graph clustering. However, most of these methods rely on complicated spectral techniques or convex optimization, and cannot be applied directly to clustering most real-world networks, whose information is often collected at different sites. Designing a simple clustering algorithm that works in the distributed setting is therefore of significant interest, with wide applications in processing big datasets.
In this paper we present a simple distributed algorithm for graph clustering: for a wide class of graphs with n nodes that have a well-defined cluster structure, our algorithm finishes in a poly-logarithmic number of rounds and recovers a partition of the graph with at most o(n) misclassified nodes. The main component of our algorithm is an application of the random matching model of load balancing, a fundamental protocol in distributed computing that has been studied extensively over the past 20 years. Hence, our result highlights an intrinsic connection between graph clustering and load balancing.
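
To make the load-balancing primitive concrete, here is a minimal Python sketch of the random matching model: in each round a random matching of the graph is sampled, and every matched pair of nodes replaces its two loads with their average. This is only an illustration of the protocol named in the abstract, not the paper's clustering algorithm. The abstract does not say how the matching is generated or how cluster labels are read off the loads, so the proposal-based matching rule and the names `random_matching`, `balance_round`, and `run` are assumptions made for this example.

```python
import random


def random_matching(adj, rng):
    """Sample a random matching using only local decisions (assumed rule).

    Each node is 'active' with probability 1/2 and proposes to one uniformly
    random neighbour. An inactive node that receives exactly one proposal
    accepts it. Proposers are active and acceptors are inactive, so no node
    can appear in two pairs.
    """
    active = {v for v in adj if rng.random() < 0.5}
    proposals = {}
    for u in active:
        if adj[u]:
            v = rng.choice(adj[u])
            if v not in active:
                proposals.setdefault(v, []).append(u)
    return [(us[0], v) for v, us in proposals.items() if len(us) == 1]


def balance_round(load, adj, rng):
    """One round: every matched pair averages its two loads in place."""
    for u, v in random_matching(adj, rng):
        avg = (load[u] + load[v]) / 2.0
        load[u] = load[v] = avg


def run(adj, load, rounds, seed=0):
    """Run the given number of rounds on a copy of `load` and return it."""
    rng = random.Random(seed)
    load = dict(load)
    for _ in range(rounds):
        balance_round(load, adj, rng)
    return load


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
start = {v: 0.0 for v in adj}
start[0] = 1.0  # all load starts on one node

after_few = run(adj, start, rounds=5)
after_many = run(adj, start, rounds=500)
```

Averaging conserves the total load, so on a connected graph every load drifts toward the global average. In the toy graph the load placed on node 0 tends to spread inside its own triangle well before much of it crosses the single bridging edge, which is the intuition behind the connection between load balancing and cluster structure that the abstract refers to.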

Subjects: Data Structures and Algorithms (cs.DS); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
ACM classes: F.2.0
Cite as: arXiv:1607.04984 [cs.DS] (or arXiv:1607.04984v1 [cs.DS] for this version)
DOI: https://doi.org/10.48550/arXiv.1607.04984

Submission history

From: Luca Zanetti
[v1] Mon, 18 Jul 2016 09:30:49 UTC (23 KB)
[v2] Mon, 5 Jun 2017 20:36:55 UTC (26 KB)
[v3] Thu, 11 Apr 2019 13:47:40 UTC (1 KB) (withdrawn)
