-
Weighted Random Dot Product Graphs
Authors:
Bernardo Marenco,
Paola Bermolen,
Marcelo Fiori,
Federico Larroca,
Gonzalo Mateos
Abstract:
Modeling of intricate relational patterns has become a cornerstone of contemporary statistical research and related data science fields. Networks, represented as graphs, offer a natural framework for this analysis. This paper extends the Random Dot Product Graph (RDPG) model to accommodate weighted graphs, markedly broadening the model's scope to scenarios where edges exhibit heterogeneous weight…
▽ More
Modeling of intricate relational patterns has become a cornerstone of contemporary statistical research and related data science fields. Networks, represented as graphs, offer a natural framework for this analysis. This paper extends the Random Dot Product Graph (RDPG) model to accommodate weighted graphs, markedly broadening the model's scope to scenarios where edges exhibit heterogeneous weight distributions. We propose a nonparametric weighted (W)RDPG model that assigns a sequence of latent positions to each node. Inner products of these nodal vectors specify the moments of their incident edge weights' distribution via moment-generating functions. In this way, and unlike prior art, the WRDPG can discriminate between weight distributions that share the same mean but differ in other higher-order moments. We derive statistical guarantees for an estimator of the nodal's latent positions adapted from the workhorse adjacency spectral embedding, establishing its consistency and asymptotic normality. We also contribute a generative framework that enables sampling of graphs that adhere to a (prescribed or data-fitted) WRDPG, facilitating, e.g., the analysis and testing of observed graph metrics using judicious reference distributions. The paper is organized to formalize the model's definition, the estimation (or nodal embedding) process and its guarantees, as well as the methodologies for generating weighted graphs, all complemented by illustrative and reproducible examples showcasing the WRDPG's effectiveness in various network analytic applications.
△ Less
Submitted 6 May, 2025; v1 submitted 6 May, 2025;
originally announced May 2025.
-
Gradient-Based Spectral Embeddings of Random Dot Product Graphs
Authors:
Marcelo Fiori,
Bernardo Marenco,
Federico Larroca,
Paola Bermolen,
Gonzalo Mateos
Abstract:
The Random Dot Product Graph (RDPG) is a generative model for relational data, where nodes are represented via latent vectors in low-dimensional Euclidean space. RDPGs crucially postulate that edge formation probabilities are given by the dot product of the corresponding latent positions. Accordingly, the embedding task of estimating these vectors from an observed graph is typically posed as a low…
▽ More
The Random Dot Product Graph (RDPG) is a generative model for relational data, where nodes are represented via latent vectors in low-dimensional Euclidean space. RDPGs crucially postulate that edge formation probabilities are given by the dot product of the corresponding latent positions. Accordingly, the embedding task of estimating these vectors from an observed graph is typically posed as a low-rank matrix factorization problem. The workhorse Adjacency Spectral Embedding (ASE) enjoys solid statistical properties, but it is formally solving a surrogate problem and can be computationally intensive. In this paper, we bring to bear recent advances in non-convex optimization and demonstrate their impact to RDPG inference. We advocate first-order gradient descent methods to better solve the embedding problem, and to organically accommodate broader network embedding applications of practical relevance. Notably, we argue that RDPG embeddings of directed graphs loose interpretability unless the factor matrices are constrained to have orthogonal columns. We thus develop a novel feasible optimization method in the resulting manifold. The effectiveness of the graph representation learning framework is demonstrated on reproducible experiments with both synthetic and real network data. Our open-source algorithm implementations are scalable, and unlike the ASE they are robust to missing edge data and can track slowly-varying latent positions from streaming graphs.
△ Less
Submitted 8 December, 2023; v1 submitted 25 July, 2023;
originally announced July 2023.
-
Online Topology Inference from Streaming Stationary Graph Signals with Partial Connectivity Information
Authors:
Rasoul Shafipour,
Gonzalo Mateos
Abstract:
We develop online graph learning algorithms from streaming network data. Our goal is to track the (possibly) time-varying network topology, and effect memory and computational savings by processing the data on-the-fly as they are acquired. The setup entails observations modeled as stationary graph signals generated by local diffusion dynamics on the unknown network. Moreover, we may have a priori…
▽ More
We develop online graph learning algorithms from streaming network data. Our goal is to track the (possibly) time-varying network topology, and effect memory and computational savings by processing the data on-the-fly as they are acquired. The setup entails observations modeled as stationary graph signals generated by local diffusion dynamics on the unknown network. Moreover, we may have a priori information on the presence or absence of a few edges as in the link prediction problem. The stationarity assumption implies that the observations' covariance matrix and the so-called graph shift operator (GSO -- a matrix encoding the graph topology) commute under mild requirements. This motivates formulating the topology inference task as an inverse problem, whereby one searches for a sparse GSO that is structurally admissible and approximately commutes with the observations' empirical covariance matrix. For streaming data said covariance can be updated recursively, and we show online proximal gradient iterations can be brought to bear to efficiently track the time-varying solution of the inverse problem with quantifiable guarantees. Specifically, we derive conditions under which the GSO recovery cost is strongly convex and use this property to prove that the online algorithm converges to within a neighborhood of the optimal time-varying batch solution. Numerical tests illustrate the effectiveness of the proposed graph learning approach in adapting to streaming information and tracking changes in the sought dynamic network.
△ Less
Submitted 7 July, 2020;
originally announced July 2020.
-
A Digraph Fourier Transform With Spread Frequency Components
Authors:
Rasoul Shafipour,
Ali Khodabakhsh,
Gonzalo Mateos,
Evdokia Nikolova
Abstract:
We study the problem of constructing a graph Fourier transform (GFT) for directed graphs (digraphs), which decomposes graph signals into different modes of variation with respect to the underlying network. Accordingly, to capture low, medium and high frequencies we seek a digraph (D)GFT such that the orthonormal frequency components are as spread as possible in the graph spectral domain. This spec…
▽ More
We study the problem of constructing a graph Fourier transform (GFT) for directed graphs (digraphs), which decomposes graph signals into different modes of variation with respect to the underlying network. Accordingly, to capture low, medium and high frequencies we seek a digraph (D)GFT such that the orthonormal frequency components are as spread as possible in the graph spectral domain. This specification gives rise to a challenging nonconvex optimization problem, so we resort to a simple yet efficient heuristic to construct the DGFT basis from Laplacian eigenvectors of an undirected version of the digraph. To select frequency components which are as spread as possible, we define a spectral dispersion function and show that it is supermodular. Moreover, we show that orthonormality can be enforced via a matroid basis constraint, which motivates adopting a scalable greedy algorithm to obtain an approximate solution with provable performance guarantee. The effectiveness of the novel DGFT is illustrated through numerical tests on synthetic and real-world graphs.
△ Less
Submitted 30 May, 2017;
originally announced May 2017.
-
Decentralized learning for wireless communications and networking
Authors:
Georgios B. Giannakis,
Qing Ling,
Gonzalo Mateos,
Ioannis D. Schizas,
Hao Zhu
Abstract:
This chapter deals with decentralized learning algorithms for in-network processing of graph-valued data. A generic learning problem is formulated and recast into a separable form, which is iteratively minimized using the alternating-direction method of multipliers (ADMM) so as to gain the desired degree of parallelization. Without exchanging elements from the distributed training sets and keeping…
▽ More
This chapter deals with decentralized learning algorithms for in-network processing of graph-valued data. A generic learning problem is formulated and recast into a separable form, which is iteratively minimized using the alternating-direction method of multipliers (ADMM) so as to gain the desired degree of parallelization. Without exchanging elements from the distributed training sets and keeping inter-node communications at affordable levels, the local (per-node) learners consent to the desired quantity inferred globally, meaning the one obtained if the entire training data set were centrally available. Impact of the decentralized learning framework to contemporary wireless communications and networking tasks is illustrated through case studies including target tracking using wireless sensor networks, unveiling Internet traffic anomalies, power system state estimation, as well as spectrum cartography for wireless cognitive radio networks.
△ Less
Submitted 30 March, 2015;
originally announced March 2015.
-
Load curve data cleansing and imputation via sparsity and low rank
Authors:
Gonzalo Mateos,
Georgios B. Giannakis
Abstract:
The smart grid vision is to build an intelligent power network with an unprecedented level of situational awareness and controllability over its services and infrastructure. This paper advocates statistical inference methods to robustify power monitoring tasks against the outlier effects owing to faulty readings and malicious attacks, as well as against missing data due to privacy concerns and com…
▽ More
The smart grid vision is to build an intelligent power network with an unprecedented level of situational awareness and controllability over its services and infrastructure. This paper advocates statistical inference methods to robustify power monitoring tasks against the outlier effects owing to faulty readings and malicious attacks, as well as against missing data due to privacy concerns and communication errors. In this context, a novel load cleansing and imputation scheme is developed leveraging the low intrinsic-dimensionality of spatiotemporal load profiles and the sparse nature of "bad data.'' A robust estimator based on principal components pursuit (PCP) is adopted, which effects a twofold sparsity-promoting regularization through an $\ell_1$-norm of the outliers, and the nuclear norm of the nominal load profiles. Upon recasting the non-separable nuclear norm into a form amenable to decentralized optimization, a distributed (D-) PCP algorithm is developed to carry out the imputation and cleansing tasks using networked devices comprising the so-termed advanced metering infrastructure. If D-PCP converges and a qualification inequality is satisfied, the novel distributed estimator provably attains the performance of its centralized PCP counterpart, which has access to all networkwide data. Computer simulations and tests with real load curve data corroborate the convergence and effectiveness of the novel D-PCP algorithm.
△ Less
Submitted 31 January, 2013;
originally announced January 2013.
-
Distributed Recursive Least-Squares: Stability and Performance Analysis
Authors:
Gonzalo Mateos,
Georgios B. Giannakis
Abstract:
The recursive least-squares (RLS) algorithm has well-documented merits for reducing complexity and storage requirements, when it comes to online estimation of stationary signals as well as for tracking slowly-varying nonstationary processes. In this paper, a distributed recursive least-squares (D-RLS) algorithm is developed for cooperative estimation using ad hoc wireless sensor networks. Distribu…
▽ More
The recursive least-squares (RLS) algorithm has well-documented merits for reducing complexity and storage requirements, when it comes to online estimation of stationary signals as well as for tracking slowly-varying nonstationary processes. In this paper, a distributed recursive least-squares (D-RLS) algorithm is developed for cooperative estimation using ad hoc wireless sensor networks. Distributed iterations are obtained by minimizing a separable reformulation of the exponentially-weighted least-squares cost, using the alternating-minimization algorithm. Sensors carry out reduced-complexity tasks locally, and exchange messages with one-hop neighbors to consent on the network-wide estimates adaptively. A steady-state mean-square error (MSE) performance analysis of D-RLS is conducted, by studying a stochastically-driven `averaged' system that approximates the D-RLS dynamics asymptotically in time. For sensor observations that are linearly related to the time-invariant parameter vector sought, the simplifying independence setting assumptions facilitate deriving accurate closed-form expressions for the MSE steady-state values. The problems of mean- and MSE-sense stability of D-RLS are also investigated, and easily-checkable sufficient conditions are derived under which a steady-state is attained. Without resorting to diminishing step-sizes which compromise the tracking ability of D-RLS, stability ensures that per sensor estimates hover inside a ball of finite radius centered at the true parameter vector, with high-probability, even when inter-sensor communication links are noisy. Interestingly, computer simulations demonstrate that the theoretical findings are accurate also in the pragmatic settings whereby sensors acquire temporally-correlated data.
△ Less
Submitted 20 September, 2011;
originally announced September 2011.