Search | arXiv e-print repository

Graph Neural Network Generalization with Gaussian Mixture Model Based Augmentation

Authors: Yassine Abbahaddou, Fragkiskos D. Malliaros, Johannes F. Lutzeyer, Amine Mohamed Aboussalah, Michalis Vazirgiannis

Abstract: Graph Neural Networks (GNNs) have shown great promise in tasks like node and graph classification, but they often struggle to generalize, particularly to unseen or out-of-distribution (OOD) data. These challenges are exacerbated when training data is limited in size or diversity. To address these issues, we introduce a theoretical framework using Rademacher complexity to compute a regret bound on… ▽ More Graph Neural Networks (GNNs) have shown great promise in tasks like node and graph classification, but they often struggle to generalize, particularly to unseen or out-of-distribution (OOD) data. These challenges are exacerbated when training data is limited in size or diversity. To address these issues, we introduce a theoretical framework using Rademacher complexity to compute a regret bound on the generalization error and then characterize the effect of data augmentation. This framework informs the design of GRATIN, an efficient graph data augmentation algorithm leveraging the capability of Gaussian Mixture Models (GMMs) to approximate any distribution. Our approach not only outperforms existing augmentation techniques in terms of generalization but also offers improved time complexity, making it highly suitable for real-world applications. △ Less

Submitted 6 June, 2025; v1 submitted 13 November, 2024; originally announced November 2024.

Journal ref: International Conference on Machine Learning (ICML) 2025

arXiv:2411.05399 [pdf, other]

Post-Hoc Robustness Enhancement in Graph Neural Networks with Conditional Random Fields

Authors: Yassine Abbahaddou, Sofiane Ennadir, Johannes F. Lutzeyer, Fragkiskos D. Malliaros, Michalis Vazirgiannis

Abstract: Graph Neural Networks (GNNs), which are nowadays the benchmark approach in graph representation learning, have been shown to be vulnerable to adversarial attacks, raising concerns about their real-world applicability. While existing defense techniques primarily concentrate on the training phase of GNNs, involving adjustments to message passing architectures or pre-processing methods, there is a no… ▽ More Graph Neural Networks (GNNs), which are nowadays the benchmark approach in graph representation learning, have been shown to be vulnerable to adversarial attacks, raising concerns about their real-world applicability. While existing defense techniques primarily concentrate on the training phase of GNNs, involving adjustments to message passing architectures or pre-processing methods, there is a noticeable gap in methods focusing on increasing robustness during inference. In this context, this study introduces RobustCRF, a post-hoc approach aiming to enhance the robustness of GNNs at the inference stage. Our proposed method, founded on statistical relational learning using a Conditional Random Field, is model-agnostic and does not require prior knowledge about the underlying model architecture. We validate the efficacy of this approach across various models, leveraging benchmark node classification datasets. △ Less

Submitted 8 November, 2024; originally announced November 2024.

arXiv:2411.04655 [pdf, other]

Centrality Graph Shift Operators for Graph Neural Networks

Authors: Yassine Abbahaddou, Fragkiskos D. Malliaros, Johannes F. Lutzeyer, Michalis Vazirgiannis

Abstract: Graph Shift Operators (GSOs), such as the adjacency and graph Laplacian matrices, play a fundamental role in graph theory and graph representation learning. Traditional GSOs are typically constructed by normalizing the adjacency matrix by the degree matrix, a local centrality metric. In this work, we instead propose and study Centrality GSOs (CGSOs), which normalize adjacency matrices by global ce… ▽ More Graph Shift Operators (GSOs), such as the adjacency and graph Laplacian matrices, play a fundamental role in graph theory and graph representation learning. Traditional GSOs are typically constructed by normalizing the adjacency matrix by the degree matrix, a local centrality metric. In this work, we instead propose and study Centrality GSOs (CGSOs), which normalize adjacency matrices by global centrality metrics such as the PageRank, $k$-core or count of fixed length walks. We study spectral properties of the CGSOs, allowing us to get an understanding of their action on graph signals. We confirm this understanding by defining and running the spectral clustering algorithm based on different CGSOs on several synthetic and real-world datasets. We furthermore outline how our CGSO can act as the message passing operator in any Graph Neural Network and in particular demonstrate strong performance of a variant of the Graph Convolutional Network and Graph Attention Network using our CGSOs on several real-world benchmark datasets. △ Less

Submitted 7 November, 2024; originally announced November 2024.

arXiv:2211.08972 [pdf, other]

New Frontiers in Graph Autoencoders: Joint Community Detection and Link Prediction

Authors: Guillaume Salha-Galvan, Johannes F. Lutzeyer, George Dasoulas, Romain Hennequin, Michalis Vazirgiannis

Abstract: Graph autoencoders (GAE) and variational graph autoencoders (VGAE) emerged as powerful methods for link prediction (LP). Their performances are less impressive on community detection (CD), where they are often outperformed by simpler alternatives such as the Louvain method. It is still unclear to what extent one can improve CD with GAE and VGAE, especially in the absence of node features. It is mo… ▽ More Graph autoencoders (GAE) and variational graph autoencoders (VGAE) emerged as powerful methods for link prediction (LP). Their performances are less impressive on community detection (CD), where they are often outperformed by simpler alternatives such as the Louvain method. It is still unclear to what extent one can improve CD with GAE and VGAE, especially in the absence of node features. It is moreover uncertain whether one could do so while simultaneously preserving good performances on LP in a multi-task setting. In this workshop paper, summarizing results from our journal publication (Salha-Galvan et al. 2022), we show that jointly addressing these two tasks with high accuracy is possible. For this purpose, we introduce a community-preserving message passing scheme, doping our GAE and VGAE encoders by considering both the initial graph and Louvain-based prior communities when computing embedding spaces. Inspired by modularity-based clustering, we further propose novel training and optimization strategies specifically designed for joint LP and CD. We demonstrate the empirical effectiveness of our approach, referred to as Modularity-Aware GAE and VGAE, on various real-world graphs. △ Less

Submitted 16 November, 2022; originally announced November 2022.

Comments: This NeurIPS 2022 GLFrontiers workshop paper summarizes results from the following journal article: arXiv:2202.00961. arXiv admin note: text overlap with arXiv:2205.14651

arXiv:2211.04248 [pdf, other]

Improving Graph Neural Networks at Scale: Combining Approximate PageRank and CoreRank

Authors: Ariel R. Ramos Vela, Johannes F. Lutzeyer, Anastasios Giovanidis, Michalis Vazirgiannis

Abstract: Graph Neural Networks (GNNs) have achieved great successes in many learning tasks performed on graph structures. Nonetheless, to propagate information GNNs rely on a message passing scheme which can become prohibitively expensive when working with industrial-scale graphs. Inspired by the PPRGo model, we propose the CorePPR model, a scalable solution that utilises a learnable convex combination of… ▽ More Graph Neural Networks (GNNs) have achieved great successes in many learning tasks performed on graph structures. Nonetheless, to propagate information GNNs rely on a message passing scheme which can become prohibitively expensive when working with industrial-scale graphs. Inspired by the PPRGo model, we propose the CorePPR model, a scalable solution that utilises a learnable convex combination of the approximate personalised PageRank and the CoreRank to diffuse multi-hop neighbourhood information in GNNs. Additionally, we incorporate a dynamic mechanism to select the most influential neighbours for a particular node which reduces training time while preserving the performance of the model. Overall, we demonstrate that CorePPR outperforms PPRGo, particularly on large graphs where selecting the most influential nodes is particularly relevant for scalability. Our code is publicly available at: https://github.com/arielramos97/CorePPR. △ Less

Submitted 8 November, 2022; originally announced November 2022.

Comments: Accepted at the "NeurIPS 2022 New Frontiers in Graph Learning Workshop (NeurIPS GLFrontiers 2022)"

arXiv:2202.00961 [pdf, other]

Modularity-Aware Graph Autoencoders for Joint Community Detection and Link Prediction

Authors: Guillaume Salha-Galvan, Johannes F. Lutzeyer, George Dasoulas, Romain Hennequin, Michalis Vazirgiannis

Abstract: Graph autoencoders (GAE) and variational graph autoencoders (VGAE) emerged as powerful methods for link prediction. Their performances are less impressive on community detection problems where, according to recent and concurring experimental evaluations, they are often outperformed by simpler alternatives such as the Louvain method. It is currently still unclear to which extent one can improve com… ▽ More Graph autoencoders (GAE) and variational graph autoencoders (VGAE) emerged as powerful methods for link prediction. Their performances are less impressive on community detection problems where, according to recent and concurring experimental evaluations, they are often outperformed by simpler alternatives such as the Louvain method. It is currently still unclear to which extent one can improve community detection with GAE and VGAE, especially in the absence of node features. It is moreover uncertain whether one could do so while simultaneously preserving good performances on link prediction. In this paper, we show that jointly addressing these two tasks with high accuracy is possible. For this purpose, we introduce and theoretically study a community-preserving message passing scheme, doping our GAE and VGAE encoders by considering both the initial graph structure and modularity-based prior communities when computing embedding spaces. We also propose novel training and optimization strategies, including the introduction of a modularity-inspired regularizer complementing the existing reconstruction losses for joint link prediction and community detection. We demonstrate the empirical effectiveness of our approach, referred to as Modularity-Aware GAE and VGAE, through in-depth experimental validation on various real-world graphs. △ Less

Submitted 20 June, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

Comments: Accepted for publication in Elsevier's Neural Networks journal in 2022

arXiv:2109.01785 [pdf, other]

Node Feature Kernels Increase Graph Convolutional Network Robustness

Authors: Mohamed El Amine Seddik, Changmin Wu, Johannes F. Lutzeyer, Michalis Vazirgiannis

Abstract: The robustness of the much-used Graph Convolutional Networks (GCNs) to perturbations of their input is becoming a topic of increasing importance. In this paper, the random GCN is introduced for which a random matrix theory analysis is possible. This analysis suggests that if the graph is sufficiently perturbed, or in the extreme case random, then the GCN fails to benefit from the node features. It… ▽ More The robustness of the much-used Graph Convolutional Networks (GCNs) to perturbations of their input is becoming a topic of increasing importance. In this paper, the random GCN is introduced for which a random matrix theory analysis is possible. This analysis suggests that if the graph is sufficiently perturbed, or in the extreme case random, then the GCN fails to benefit from the node features. It is furthermore observed that enhancing the message passing step in GCNs by adding the node feature kernel to the adjacency matrix of the graph structure solves this problem. An empirical study of a GCN utilised for node classification on six real datasets further confirms the theoretical findings and demonstrates that perturbations of the graph structure can result in GCNs performing significantly worse than Multi-Layer Perceptrons run on the node features alone. In practice, adding a node feature kernel to the message passing of perturbed graphs results in a significant improvement of the GCN's performance, thereby rendering it more robust to graph perturbations. Our code is publicly available at:https://github.com/ChangminWu/RobustGCN. △ Less

Submitted 21 February, 2022; v1 submitted 4 September, 2021; originally announced September 2021.

Comments: 16 pages, 5 figures

arXiv:1712.03769 [pdf, ps, other]

doi 10.1007/978-3-030-36683-4_16

Comparing Graph Spectra of Adjacency and Laplacian Matrices

Authors: J. F. Lutzeyer, A. T. Walden

Abstract: Typically, graph structures are represented by one of three different matrices: the adjacency matrix, the unnormalised and the normalised graph Laplacian matrices. The spectral (eigenvalue) properties of these different matrices are compared. For each pair, the comparison is made by applying an affine transformation to one of them, which enables comparison whilst preserving certain key properties… ▽ More Typically, graph structures are represented by one of three different matrices: the adjacency matrix, the unnormalised and the normalised graph Laplacian matrices. The spectral (eigenvalue) properties of these different matrices are compared. For each pair, the comparison is made by applying an affine transformation to one of them, which enables comparison whilst preserving certain key properties such as normalised eigengaps. Bounds are given on the eigenvalue differences thus found, which depend on the minimum and maximum degree of the graph. The monotonicity of the bounds and the structure of the graphs are related. The bounds on a real social network graph, and on three model graphs, are illustrated and analysed. The methodology is extended to provide bounds on normalised eigengap differences which again turn out to be in terms of the graph's degree extremes. It is found that if the degree extreme difference is large, different choices of representation matrix may give rise to disparate inference drawn from graph signal processing algorithms; smaller degree extreme differences result in consistent inference, whatever the choice of representation matrix. The different inference drawn from signal processing algorithms is visualised using the spectral clustering algorithm on the three representation matrices corresponding to a model graph and a real social network graph. △ Less

Submitted 11 December, 2017; originally announced December 2017.

Showing 1–8 of 8 results for author: Lutzeyer, J F