Efficient Iterative Programs with Distributed Data Collections
Authors:
Sarah Chlyah,
Nils Gesbert,
Pierre Genevès,
Nabil Layaïda
Abstract:
Big data programming frameworks have become increasingly important
for the development of applications for which performance and
scalability are critical. In those complex frameworks, optimizing
code by hand is hard and time-consuming, making automated
optimization particularly necessary. In order to automate
optimization, a prerequisite is to find suitable abstractions to
represent programs; for instance, algebras based on monads or
monoids to represent distributed data collections. Currently,
however, such algebras do not represent recursive programs in a way
that allows for analyzing or rewriting them. In this paper, we extend a
monoid algebra with a fixpoint operator for representing recursion
as a first-class citizen and show how it enables new optimizations.
Experiments with the Spark platform illustrate performance gains
brought by these systematic optimizations.
Submitted 13 June, 2023;
originally announced June 2023.
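As a rough illustration of the idea in the abstract above, a fixpoint operator over a collection can be sketched in plain Python. This is not the paper's algebra or its Spark implementation: the names `fixpoint`, `transitive_closure`, and `extend` are hypothetical, sets stand in for distributed collections, and set union plays the role of the monoid operation. The sketch assumes the step function distributes over union, which is what makes it safe to feed only newly found elements back into each iteration.

```python
from typing import Callable, Set, Tuple

Edge = Tuple[int, int]


def fixpoint(seed: Set[Edge], step: Callable[[Set[Edge]], Set[Edge]]) -> Set[Edge]:
    """Return the smallest set X such that X = seed | step(X).

    Assumes `step` distributes over union, so each round only needs to
    process the elements discovered in the previous round (the delta).
    """
    result = set(seed)
    delta = set(seed)
    while delta:
        # Keep only elements not seen before; stop when nothing new appears.
        delta = step(delta) - result
        result |= delta
    return result


def transitive_closure(edges: Set[Edge]) -> Set[Edge]:
    """All pairs (a, d) connected by a path of one or more edges."""

    def extend(paths: Set[Edge]) -> Set[Edge]:
        # Append one edge to every path: a naive in-memory join.
        return {(a, d) for (a, b) in paths for (c, d) in edges if b == c}

    return fixpoint(edges, extend)


edges = {(1, 2), (2, 3), (3, 4)}
closure = transitive_closure(edges)
```

Because the recursion is an explicit operator rather than an opaque loop, a rewriter can inspect `seed` and `step` separately, which is the kind of analysis the abstract refers to.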
Knowledge Enhanced Graph Neural Networks for Graph Completion
Authors:
Luisa Werner,
Nabil Layaïda,
Pierre Genevès,
Sarah Chlyah
Abstract:
Graph data is omnipresent and has a wide variety of applications, such as in natural science, social networks, or the semantic web. However, while being rich in information, graphs are often noisy and incomplete. As a result, graph completion tasks, such as node classification or link prediction, have gained attention. On the one hand, neural methods, such as graph neural networks, have proven to be robust tools for learning rich representations of noisy graphs. On the other hand, symbolic methods enable exact reasoning on graphs. We propose Knowledge Enhanced Graph Neural Networks (KeGNN), a neuro-symbolic framework for graph completion that combines both paradigms, as it allows for the integration of prior knowledge into a graph neural network model. Essentially, KeGNN consists of a graph neural network as a base upon which knowledge enhancement layers are stacked with the goal of refining predictions with respect to prior knowledge. We instantiate KeGNN in conjunction with two state-of-the-art graph neural networks, Graph Convolutional Networks and Graph Attention Networks, and evaluate KeGNN on multiple benchmark datasets for node classification.
Submitted 31 August, 2023; v1 submitted 27 March, 2023;
originally announced March 2023.
Distributed Evaluation of Graph Queries using Recursive Relational Algebra
Authors:
Sarah Chlyah,
Pierre Genevès,
Nabil Layaïda
Abstract:
We present a system called Dist-$μ$-RA for the distributed evaluation of recursive graph queries. Dist-$μ$-RA builds on the recursive relational algebra and extends it with evaluation plans suited for the distributed setting. The goal is to offer expressivity for high-level queries while providing efficiency at scale and reducing communication costs. Experimental results on both real and synthetic graphs show the effectiveness of the proposed approach compared to existing systems.
Submitted 31 March, 2025; v1 submitted 24 November, 2021;
originally announced November 2021.