GOSH: Embedding Big Graphs on Small Hardware

Akyildiz, Taha Atahan; Aljundi, Amro Alabsi; Kaya, Kamer

doi:10.1145/3404397.3404456

arXiv:2008.12336 (cs)

[Submitted on 27 Aug 2020 (v1), last revised 31 Aug 2020 (this version, v2)]

Abstract: In graph embedding, the connectivity information of a graph is used to represent each vertex as a point in a d-dimensional space. Unlike the original, irregular structural information, such a representation can be used for a multitude of machine learning tasks. Although the process is extremely useful in practice, it is expensive, and graphs are becoming larger and harder to embed. Attempts at scaling up the process to larger graphs have been successful, but often at a steep price in hardware requirements. We present GOSH, an approach for embedding graphs of arbitrary sizes on a single GPU with minimum constraints. GOSH utilizes a novel graph coarsening approach to compress the graph and minimize the work required for embedding, delivering high-quality embeddings in a fraction of the time required by the state of the art. In addition, it incorporates a decomposition schema that enables any arbitrarily large graph to be embedded using a single GPU with minimum constraints on the memory size. With these techniques, GOSH is able to embed a graph with over 65 million vertices and 1.8 billion edges in less than an hour on a single GPU, obtaining a 93% AUCROC for link prediction, which can be increased to 95% by running the tool for 80 minutes.
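The abstract does not describe how GOSH coarsens a graph, so the following is only a generic illustration of the idea of coarsening, not the authors' algorithm. It is a minimal sketch that assumes a simple greedy matching rule: each unmatched vertex is merged with its first unmatched neighbor into one super-vertex. The function name `coarsen_once` and its return format are hypothetical and chosen for this example.

```python
from collections import defaultdict


def coarsen_once(edges, num_vertices):
    """One level of greedy-matching coarsening (illustrative assumption,
    not GOSH's method).

    edges: iterable of (u, v) pairs of an undirected graph.
    Returns (coarse_edges, mapping, num_super), where mapping[v] is the
    super-vertex that original vertex v was merged into.
    """
    # Build an adjacency list for the undirected graph.
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    mapping = [-1] * num_vertices
    num_super = 0
    for v in range(num_vertices):
        if mapping[v] != -1:
            continue  # already merged into an earlier super-vertex
        mapping[v] = num_super
        # Merge v with its first still-unmatched neighbor, if any.
        for w in adj[v]:
            if mapping[w] == -1:
                mapping[w] = num_super
                break
        num_super += 1

    # Keep one edge between each pair of distinct super-vertices.
    coarse_edges = set()
    for u, v in edges:
        cu, cv = mapping[u], mapping[v]
        if cu != cv:
            coarse_edges.add((min(cu, cv), max(cu, cv)))
    return sorted(coarse_edges), mapping, num_super


# Path graph 0-1-2-3 collapses to two super-vertices joined by one edge.
coarse_edges, mapping, num_super = coarsen_once([(0, 1), (1, 2), (2, 3)], 4)
print(coarse_edges, mapping, num_super)
```

Applying such a step repeatedly yields a hierarchy of progressively smaller graphs, which is the general sense in which coarsening can reduce the work of an embedding pass.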

Comments:	11 pages, 4 figures, published in ICPP 2020: The 49th International Conference on Parallel Processing - Edmonton, AB, Canada
Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:2008.12336 [cs.DC]
	(or arXiv:2008.12336v2 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.2008.12336
Related DOI:	https://doi.org/10.1145/3404397.3404456

Submission history

From: Amro Alabsi Aljundi
[v1] Thu, 27 Aug 2020 18:53:32 UTC (505 KB)
[v2] Mon, 31 Aug 2020 09:38:24 UTC (505 KB)
