Learn Locally, Correct Globally: A Distributed Algorithm for Training Graph Neural Networks

Ramezani, Morteza; Cong, Weilin; Mahdavi, Mehrdad; Kandemir, Mahmut T.; Sivasubramaniam, Anand

arXiv:2111.08202 (cs)

[Submitted on 16 Nov 2021 (v1), last revised 13 Mar 2022 (this version, v4)]

Abstract: Despite the recent success of Graph Neural Networks (GNNs), training GNNs on large graphs remains challenging. The limited resource capacity of existing servers, the dependency between nodes in a graph, and the privacy concerns raised by centralized storage and model learning have spurred the need for an effective distributed algorithm for GNN training. However, existing distributed GNN training methods impose either excessive communication costs or large memory overheads that hinder their scalability. To overcome these issues, we propose a communication-efficient distributed GNN training technique named $\text{Learn Locally, Correct Globally}$ (LLCG). To reduce the communication and memory overhead, each local machine in LLCG first trains a GNN on its local data, ignoring the dependency between nodes on different machines, and then sends the locally trained model to the server for periodic model averaging. However, ignoring node dependency could result in significant performance degradation. To counter this degradation, we propose applying $\text{Global Server Corrections}$ on the server to refine the locally learned models. We rigorously analyze the convergence of distributed methods with periodic model averaging for training GNNs and show that naively applying periodic model averaging while ignoring the dependency between nodes suffers from an irreducible residual error. However, this residual error can be eliminated by the proposed global corrections, which yields a fast convergence rate. Extensive experiments on real-world datasets show that LLCG can significantly improve efficiency without hurting performance.
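The training loop described in the abstract (local training on each machine's subgraph, periodic model averaging, then a correction step on the server) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a one-layer linear GNN with squared-error loss, full-batch gradients, equal-weight averaging, and a server that can see the whole graph. The names `normalized_adj`, `mse_grad`, and `llcg_sketch`, and all hyperparameter defaults, are hypothetical and chosen only for illustration.

```python
import numpy as np


def normalized_adj(adj):
    """Add self-loops and row-normalize, so each node averages over its neighbors."""
    a = adj + np.eye(adj.shape[0])
    return a / a.sum(axis=1, keepdims=True)


def mse_grad(w, a_hat, x, y):
    """Gradient of the mean squared error for the toy model pred = A_hat @ X @ w."""
    h = a_hat @ x
    return 2.0 * h.T @ (h @ w - y) / len(y)


def llcg_sketch(adj, x, y, parts, rounds=20, local_steps=5,
                correction_steps=2, lr=0.05):
    """Toy loop: local training, model averaging, server correction (assumed setup)."""
    w = np.zeros(x.shape[1])
    # Each machine keeps only the subgraph induced by its own nodes,
    # so edges that cross machines are ignored during local training.
    local_data = [(normalized_adj(adj[np.ix_(p, p)]), x[p], y[p]) for p in parts]
    a_full = normalized_adj(adj)  # assumption: the server sees the full graph

    for _ in range(rounds):
        local_models = []
        for a_loc, x_loc, y_loc in local_data:
            w_loc = w.copy()
            for _ in range(local_steps):
                w_loc -= lr * mse_grad(w_loc, a_loc, x_loc, y_loc)
            local_models.append(w_loc)

        # Periodic model averaging on the server (equal weights assumed).
        w = np.mean(local_models, axis=0)

        # Server correction: a few gradient steps that use the cross-machine
        # edges the local machines ignored.
        for _ in range(correction_steps):
            w -= lr * mse_grad(w, a_full, x, y)
    return w
```

Calling `llcg_sketch(adj, x, y, parts)` with a symmetric 0/1 adjacency matrix, a feature matrix, a target vector, and a list of node-index arrays (one per machine) returns the final weight vector. Setting `correction_steps=0` reduces the loop to plain periodic model averaging, which is the baseline the abstract says suffers from a residual error.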

Comments:	The Tenth International Conference on Learning Representations (ICLR 2022)
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2111.08202 [cs.LG]
	(or arXiv:2111.08202v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2111.08202

Submission history

From: Morteza Ramezani
[v1] Tue, 16 Nov 2021 03:07:01 UTC (1,293 KB)
[v2] Wed, 17 Nov 2021 17:06:22 UTC (1,251 KB)
[v3] Tue, 7 Dec 2021 16:55:20 UTC (1,252 KB)
[v4] Sun, 13 Mar 2022 14:51:02 UTC (1,246 KB)
