-
An embedding-based distance for temporal graphs
Authors:
Lorenzo Dall'Amico,
Alain Barrat,
Ciro Cattuto
Abstract:
Temporal graphs are commonly used to represent time-resolved relations between entities in many natural and artificial systems. Many techniques were devised to investigate the evolution of temporal graphs by comparing their state at different time points. However, quantifying the similarity between temporal graphs as a whole is an open problem. Here, we use embeddings based on time-respecting rand…
▽ More
Temporal graphs are commonly used to represent time-resolved relations between entities in many natural and artificial systems. Many techniques were devised to investigate the evolution of temporal graphs by comparing their state at different time points. However, quantifying the similarity between temporal graphs as a whole is an open problem. Here, we use embeddings based on time-respecting random walks to introduce a new notion of distance between temporal graphs. This distance is well-defined for pairs of temporal graphs with different numbers of nodes and different time spans. We study the case of a matched pair of graphs, when a known relation exists between their nodes, and the case of unmatched graphs, when such a relation is unavailable and the graphs may be of different sizes. We use empirical and synthetic temporal network data to show that the distance we introduce discriminates graphs with different topological and temporal properties. We provide an efficient implementation of the distance computation suitable for large-scale temporal graphs.
△ Less
Submitted 19 November, 2024; v1 submitted 23 January, 2024;
originally announced January 2024.
-
Political Context of the European Vaccine Debate on Twitter
Authors:
Giordano Paoletti,
Lorenzo Dall'Amico,
Kyriaki Kalimeri,
Jacopo Lenti,
Yelena Mejova,
Daniela Paolotti,
Michele Starnini,
Michele Tizzani
Abstract:
At the beginning of the COVID-19 pandemic, fears grew that making vaccination a political (instead of public health) issue may impact the efficacy of this life-saving intervention, spurring the spread of vaccine-hesitant content. In this study, we examine whether there is a relationship between the political interest of social media users and their exposure to vaccine-hesitant content on Twitter.…
▽ More
At the beginning of the COVID-19 pandemic, fears grew that making vaccination a political (instead of public health) issue may impact the efficacy of this life-saving intervention, spurring the spread of vaccine-hesitant content. In this study, we examine whether there is a relationship between the political interest of social media users and their exposure to vaccine-hesitant content on Twitter. We focus on 17 European countries using a multilingual, longitudinal dataset of tweets spanning the period before COVID, up to the vaccine roll-out. We find that, in most countries, users' endorsement of vaccine-hesitant content is the highest in the early months of the pandemic, around the time of greatest scientific uncertainty. Further, users who follow politicians from right-wing parties, and those associated with authoritarian or anti-EU stances are more likely to endorse vaccine-hesitant content, whereas those following left-wing politicians, more pro-EU or liberal parties, are less likely. Somewhat surprisingly, politicians did not play an outsized role in the vaccine debates of their countries, receiving a similar number of retweets as other similarly popular users. This systematic, multi-country, longitudinal investigation of the connection of politics with vaccine hesitancy has important implications for public health policy and communication.
△ Less
Submitted 1 March, 2024; v1 submitted 6 September, 2023;
originally announced September 2023.
-
Learning distributed representations with efficient SoftMax normalization
Authors:
Lorenzo Dall'Amico,
Enrico Maria Belliardo
Abstract:
Learning distributed representations, or embeddings, that encode the relational similarity patterns among objects is a relevant task in machine learning. A popular method to learn the embedding matrices $X, Y$ is optimizing a loss function of the term ${\rm SoftMax}(XY^T)$. The complexity required to calculate this term, however, runs quadratically with the problem size, making it a computationall…
▽ More
Learning distributed representations, or embeddings, that encode the relational similarity patterns among objects is a relevant task in machine learning. A popular method to learn the embedding matrices $X, Y$ is optimizing a loss function of the term ${\rm SoftMax}(XY^T)$. The complexity required to calculate this term, however, runs quadratically with the problem size, making it a computationally heavy solution. In this article, we propose a linear-time heuristic approximation to compute the normalization constants of ${\rm SoftMax}(XY^T)$ for embedding vectors with bounded norms. We show on some pre-trained embedding datasets that the proposed estimation method achieves higher or comparable accuracy with competing methods. From this result, we design an efficient and task-agnostic algorithm that learns the embeddings by optimizing the cross entropy between the softmax and a set of probability distributions given as inputs. The proposed algorithm is interpretable and easily adapted to arbitrary embedding problems. We consider a few use cases and observe similar or higher performances and a lower computational time than similar ``2Vec'' algorithms.
△ Less
Submitted 1 June, 2025; v1 submitted 30 March, 2023;
originally announced March 2023.
-
Nishimori meets Bethe: a spectral method for node classification in sparse weighted graphs
Authors:
Lorenzo Dall'Amico,
Romain Couillet,
Nicolas Tremblay
Abstract:
This article unveils a new relation between the Nishimori temperature parametrizing a distribution P and the Bethe free energy on random Erdos-Renyi graphs with edge weights distributed according to P. Estimating the Nishimori temperature being a task of major importance in Bayesian inference problems, as a practical corollary of this new relation, a numerical method is proposed to accurately esti…
▽ More
This article unveils a new relation between the Nishimori temperature parametrizing a distribution P and the Bethe free energy on random Erdos-Renyi graphs with edge weights distributed according to P. Estimating the Nishimori temperature being a task of major importance in Bayesian inference problems, as a practical corollary of this new relation, a numerical method is proposed to accurately estimate the Nishimori temperature from the eigenvalues of the Bethe Hessian matrix of the weighted graph. The algorithm, in turn, is used to propose a new spectral method for node classification in weighted (possibly sparse) graphs. The superiority of the method over competing state-of-the-art approaches is demonstrated both through theoretical arguments and real-world data experiments.
△ Less
Submitted 24 September, 2021; v1 submitted 5 March, 2021;
originally announced March 2021.
-
Community detection in sparse time-evolving graphs with a dynamical Bethe-Hessian
Authors:
Lorenzo Dall'Amico,
Romain Couillet,
Nicolas Tremblay
Abstract:
This article considers the problem of community detection in sparse dynamical graphs in which the community structure evolves over time. A fast spectral algorithm based on an extension of the Bethe-Hessian matrix is proposed, which benefits from the positive correlation in the class labels and in their temporal evolution and is designed to be applicable to any dynamical graph with a community stru…
▽ More
This article considers the problem of community detection in sparse dynamical graphs in which the community structure evolves over time. A fast spectral algorithm based on an extension of the Bethe-Hessian matrix is proposed, which benefits from the positive correlation in the class labels and in their temporal evolution and is designed to be applicable to any dynamical graph with a community structure. Under the dynamical degree-corrected stochastic block model, in the case of two classes of equal size, we demonstrate and support with extensive simulations that our proposed algorithm is capable of making non-trivial community reconstruction as soon as theoretically possible, thereby reaching the optimal detectability threshold and provably outperforming competing spectral methods.
△ Less
Submitted 26 October, 2020; v1 submitted 3 June, 2020;
originally announced June 2020.
-
A unified framework for spectral clustering in sparse graphs
Authors:
Lorenzo Dall'Amico,
Romain Couillet,
Nicolas Tremblay
Abstract:
This article considers spectral community detection in the regime of sparse networks with heterogeneous degree distributions, for which we devise an algorithm to efficiently retrieve communities. Specifically, we demonstrate that a conveniently parametrized form of regularized Laplacian matrix can be used to perform spectral clustering in sparse networks, without suffering from its degree heteroge…
▽ More
This article considers spectral community detection in the regime of sparse networks with heterogeneous degree distributions, for which we devise an algorithm to efficiently retrieve communities. Specifically, we demonstrate that a conveniently parametrized form of regularized Laplacian matrix can be used to perform spectral clustering in sparse networks, without suffering from its degree heterogeneity. Besides, we exhibit important connections between this proposed matrix and the now popular non-backtracking matrix, the Bethe-Hessian matrix, as well as the standard Laplacian matrix. Interestingly, as opposed to competitive methods, our proposed improved parametrization inherently accounts for the hardness of the classification problem. These findings are summarized under the form of an algorithm capable of both estimating the number of communities and achieving high-quality community reconstruction.
△ Less
Submitted 10 October, 2021; v1 submitted 20 March, 2020;
originally announced March 2020.
-
Optimal Laplacian regularization for sparse spectral community detection
Authors:
Lorenzo Dall'Amico,
Romain Couillet,
Nicolas Tremblay
Abstract:
Regularization of the classical Laplacian matrices was empirically shown to improve spectral clustering in sparse networks. It was observed that small regularizations are preferable, but this point was left as a heuristic argument. In this paper we formally determine a proper regularization which is intimately related to alternative state-of-the-art spectral techniques for sparse graphs.
Regularization of the classical Laplacian matrices was empirically shown to improve spectral clustering in sparse networks. It was observed that small regularizations are preferable, but this point was left as a heuristic argument. In this paper we formally determine a proper regularization which is intimately related to alternative state-of-the-art spectral techniques for sparse graphs.
△ Less
Submitted 15 May, 2020; v1 submitted 3 December, 2019;
originally announced December 2019.
-
Revisiting the Bethe-Hessian: Improved Community Detection in Sparse Heterogeneous Graphs
Authors:
Lorenzo Dall'Amico,
Romain Couillet,
Nicolas Tremblay
Abstract:
Spectral clustering is one of the most popular, yet still incompletely understood, methods for community detection on graphs. This article studies spectral clustering based on the Bethe-Hessian matrix $H_r = (r^2-1)I_n + D-rA$ for sparse heterogeneous graphs (following the degree-corrected stochastic block model) in a two-class setting. For a specific value $r = ζ$, clustering is shown to be insen…
▽ More
Spectral clustering is one of the most popular, yet still incompletely understood, methods for community detection on graphs. This article studies spectral clustering based on the Bethe-Hessian matrix $H_r = (r^2-1)I_n + D-rA$ for sparse heterogeneous graphs (following the degree-corrected stochastic block model) in a two-class setting. For a specific value $r = ζ$, clustering is shown to be insensitive to the degree heterogeneity. We then study the behavior of the informative eigenvector of $H_ζ$ and, as a result, predict the clustering accuracy. The article concludes with an overview of the generalization to more than two classes along with extensive simulations on synthetic and real networks corroborating our findings.
△ Less
Submitted 9 October, 2019; v1 submitted 25 January, 2019;
originally announced January 2019.