-
A Survey on Embedding Dynamic Graphs
Authors:
Claudio D. T. Barros,
Matheus R. F. Mendonça,
Alex B. Vieira,
Artur Ziviani
Abstract:
Embedding static graphs in low-dimensional vector spaces plays a key role in network analytics and inference, supporting applications like node classification, link prediction, and graph visualization. However, many real-world networks present dynamic behavior, including topological evolution, feature evolution, and diffusion. Therefore, several methods for embedding dynamic graphs have been proposed to learn network representations over time, facing novel challenges, such as time-domain modeling, temporal features to be captured, and the temporal granularity to be embedded. In this survey, we overview dynamic graph embedding, discussing its fundamentals and the recent advances developed so far. We introduce the formal definition of dynamic graph embedding, focusing on the problem setting and introducing a novel taxonomy for dynamic graph embedding input and output. We further explore different dynamic behaviors that may be encompassed by embeddings, classifying by topological evolution, feature evolution, and processes on networks. Afterward, we describe existing techniques and propose a taxonomy for dynamic graph embedding techniques based on algorithmic approaches, from matrix and tensor factorization to deep learning, random walks, and temporal point processes. We also elucidate main applications, including dynamic link prediction, anomaly detection, and diffusion prediction, and we further state some promising research directions in the area.
Submitted 21 July, 2021; v1 submitted 4 January, 2021;
originally announced January 2021.
-
Efficient Information Diffusion in Time-Varying Graphs through Deep Reinforcement Learning
Authors:
Matheus R. F. Mendonça,
André M. S. Barreto,
Artur Ziviani
Abstract:
Network seeding for efficient information diffusion over time-varying graphs (TVGs) is a challenging task with many real-world applications. There are several ways to model this spatio-temporal influence maximization problem, but the ultimate goal is to determine the best moment for a node to start the diffusion process. In this context, we propose Spatio-Temporal Influence Maximization (STIM), a model trained with Reinforcement Learning and Graph Embedding over a set of artificial TVGs that is capable of learning the temporal behavior and connectivity pattern of each node, allowing it to predict the best moment to start a diffusion through the TVG. We also develop a special set of artificial TVGs used for training that simulate a stochastic diffusion process in TVGs, showing that the STIM network can learn an efficient policy even over a non-deterministic environment. STIM is also evaluated with a real-world TVG, where it also manages to efficiently propagate information through the nodes. Finally, we also show that the STIM model has a time complexity of $O(|E|)$. STIM, therefore, presents a novel approach for efficient information diffusion in TVGs, being highly versatile, where one can change the goal of the model by simply changing the adopted reward function.
Submitted 26 November, 2020;
originally announced November 2020.
-
Emergence of complex data from simple local rules in a network game
Authors:
Felipe S. Abrahão,
Klaus Wehmuth,
Artur Ziviani
Abstract:
As one of the main subjects of investigation in data science, network science has demonstrated a wide range of applications to real-world network analysis and modeling. For example, the pervasive presence of structural or topological characteristics, such as the small-world phenomenon, small-diameter, scale-free properties, or fat-tailed degree distribution, was one of the underlying pillars fostering the study of complex networks. Relating these phenomena with other emergent properties in complex systems became a subject of central importance. By introducing new implications on the interface between data science and complex systems science with the purpose of tackling some of these issues, in this article we present a model for a network game played by complex networks in which nodes are computable systems. In particular, we present and discuss how some network topological properties and simple local communication rules are able to generate a phase transition with respect to the emergence of incompressible data.
Submitted 23 September, 2020;
originally announced September 2020.
-
An Algorithmic Information Distortion in Multidimensional Networks
Authors:
Felipe S. Abrahão,
Klaus Wehmuth,
Hector Zenil,
Artur Ziviani
Abstract:
Network complexity, network information content analysis, and lossless compressibility of graph representations have played an important role in network analysis and network modeling. As multidimensional networks, such as time-varying, multilayer, or dynamic multilayer networks, gain more relevancy in network science, it becomes crucial to investigate in which situations universal algorithmic methods based on algorithmic information theory applied to graphs cannot be straightforwardly imported into the multidimensional case. In this direction, as a worst-case scenario of lossless compressibility distortion that increases linearly with the number of distinct dimensions, this article presents a counter-intuitive phenomenon that occurs when dealing with networks within non-uniform and sufficiently large multidimensional spaces. In particular, we demonstrate that the algorithmic information necessary to encode multidimensional networks that are isomorphic to logarithmically compressible monoplex networks may display exponentially larger distortions in the general case.
Submitted 5 October, 2020; v1 submitted 12 September, 2020;
originally announced September 2020.
-
On the existence of hidden machines in computational time hierarchies
Authors:
Felipe S. Abrahão,
Klaus Wehmuth,
Artur Ziviani
Abstract:
Challenging the standard notion of totality in computable functions, one has that, given any sufficiently expressive formal axiomatic system, there are total functions that, although computable and "intuitively" understood as being total, cannot be proved to be total. In this article we show that this implies the existence of an infinite hierarchy of time complexity classes whose representative members are hidden from (or unknown by) the respective formal axiomatic systems. Although these classes contain total computable functions, there are some of these functions that the formal axiomatic system cannot recognize as belonging to a time complexity class. This leads to incompleteness results regarding formalizations of computational complexity.
Submitted 2 September, 2020;
originally announced September 2020.
-
Approximating Network Centrality Measures Using Node Embedding and Machine Learning
Authors:
Matheus R. F. Mendonça,
André M. S. Barreto,
Artur Ziviani
Abstract:
Extracting information from real-world large networks is a key challenge nowadays. For instance, computing a node centrality may become unfeasible depending on the intended centrality due to its computational cost. One solution is to develop fast methods capable of approximating network centralities. Here, we propose an approach for efficiently approximating node centralities for large networks using Neural Networks and Graph Embedding techniques. Our proposed model, entitled Network Centrality Approximation using Graph Embedding (NCA-GE), uses the adjacency matrix of a graph and a set of features for each node (here, we use only the degree) as input and computes the approximate desired centrality rank for every node. NCA-GE has a time complexity of $O(|E|)$, $E$ being the set of edges of a graph, making it suitable for large networks. NCA-GE also trains pretty fast, requiring only a set of a thousand small synthetic scale-free graphs (ranging from 100 to 1000 nodes each), and it works well for different node centralities, network sizes, and topologies. Finally, we compare our approach to the state-of-the-art method that approximates centrality ranks using the degree and eigenvector centralities as input, where we show that the NCA-GE outperforms the former in a variety of scenarios.
Submitted 1 November, 2020; v1 submitted 29 June, 2020;
originally announced June 2020.
-
You Shall not Pass: Avoiding Spurious Paths in Shortest-Path Based Centralities in Multidimensional Complex Networks
Authors:
Klaus Wehmuth,
Artur Ziviani,
Leonardo Chinelate Costa,
Ana Paula Couto da Silva,
Alex Borges Vieira
Abstract:
In complex network analysis, centralities based on shortest paths, such as betweenness and closeness, are widely used. More recently, many complex systems are being represented by time-varying, multilayer, and time-varying multilayer networks, i.e. multidimensional (or high order) networks. Nevertheless, it is well-known that the aggregation process may create spurious paths on the aggregated view of such multidimensional (high order) networks. Consequently, these spurious paths may then cause shortest-path based centrality metrics to produce incorrect results, thus undermining the network centrality analysis. In this context, we propose a method able to avoid taking into account spurious paths when computing centralities based on shortest paths in multidimensional (or high order) networks. Our method is based on MultiAspect Graphs (MAG) to represent the multidimensional networks and we show that well-known centrality algorithms can be straightforwardly adapted to the MAG environment. Moreover, we show that, by using this MAG representation, pitfalls usually associated with spurious paths resulting from aggregation in multidimensional networks can be avoided at the time of the aggregation process. As a result, shortest-path based centralities are assured to be computed correctly for multidimensional networks, without taking into account spurious paths that could otherwise lead to incorrect results. We also present a case study that shows the impact of spurious paths in the computing of shortest paths and consequently of shortest-path based centralities, such as betweenness and closeness, thus illustrating the importance of this contribution.
Submitted 19 August, 2020; v1 submitted 27 June, 2020;
originally announced June 2020.
-
DJEnsemble: On the Selection of a Disjoint Ensemble of Deep Learning Black-Box Spatio-Temporal Models
Authors:
Yania Molina Souto,
Rafael Pereira,
Rocío Zorrilla,
Anderson Chaves,
Brian Tsan,
Florin Rusu,
Eduardo Ogasawara,
Artur Ziviani,
Fabio Porto
Abstract:
In this paper, we present a cost-based approach for the automatic selection and allocation of a disjoint ensemble of black-box predictors to answer predictive spatio-temporal queries. Our approach is divided into two parts -- offline and online. During the offline part, we preprocess the predictive domain data -- transforming it into a regular grid -- and the black-box models -- computing their spatio-temporal learning function. In the online part, we compute a DJEnsemble plan which minimizes a multivariate cost function based on estimates for the prediction error and the execution cost -- producing a model spatial allocation matrix -- and run the optimal ensemble plan. We conduct a set of extensive experiments that evaluate the DJEnsemble approach and highlight its efficiency. We show that our cost model produces plans with performance close to the actual best plan. When compared against the traditional ensemble approach, DJEnsemble achieves up to $4X$ improvement in execution time and almost $9X$ improvement in prediction accuracy. To the best of our knowledge, this is the first work to solve the problem of optimizing the allocation of black-box models to answer predictive spatio-temporal queries.
Submitted 17 November, 2020; v1 submitted 22 May, 2020;
originally announced May 2020.
-
Transtemporal edges and crosslayer edges in incompressible high-order networks
Authors:
Felipe S. Abrahão,
Klaus Wehmuth,
Artur Ziviani
Abstract:
This work presents some outcomes of a theoretical investigation of incompressible high-order networks defined by a generalized graph representation. We study some of their network topological properties and how these may be related to real-world complex networks. We show that these networks have very short diameter, high k-connectivity, degrees of the order of half of the network size within a strong-asymptotically dominated standard deviation, and rigidity with respect to automorphisms. In addition, we demonstrate that incompressible dynamic (or dynamic multilayered) networks have transtemporal (or crosslayer) edges and, thus, a snapshot-like representation of dynamic networks is inaccurate for capturing the presence of such edges that compose underlying structures of some real-world networks.
Submitted 13 May, 2019;
originally announced May 2019.
-
Learning the undecidable from networked systems
Authors:
Felipe S. Abrahão,
Ítala M. Loffredo D'Ottaviano,
Klaus Wehmuth,
Francisco Antônio Dória,
Artur Ziviani
Abstract:
This article presents a theoretical investigation of computation beyond the Turing barrier from emergent behavior in distributed systems. In particular, we present an algorithmic network that is a mathematical model of a networked population of randomly generated computable systems with a fixed communication protocol. Then, in order to solve an undecidable problem, we study how nodes (i.e., Turing machines or computable systems) can harness the power of the metabiological selection and the power of information sharing (i.e., communication) through the network. Formally, we show that there is a pervasive network topological condition, in particular, the small-diameter phenomenon, that ensures that every node becomes capable of solving the halting problem for every program with a length upper bounded by a logarithmic order of the population size. In addition, we show that this result implies the existence of a central node capable of emergently solving the halting problem in the minimum number of communication rounds. Furthermore, we introduce an algorithmic-informational measure of synergy for networked computable systems, which we call local algorithmic synergy. Then, we show that such an algorithmic network can produce an arbitrarily large value of expected local algorithmic synergy.
Submitted 4 October, 2019; v1 submitted 8 April, 2019;
originally announced April 2019.
-
Expected Emergence of Algorithmic Information from a Lower Bound for Stationary Prevalence
Authors:
Felipe S. Abrahão,
Klaus Wehmuth,
Artur Ziviani
Abstract:
We study emergent information in populations of randomly generated networked computable systems that follow a Susceptible-Infected-Susceptible contagion (or infection) model of imitation of the fittest neighbor. These networks have a scale-free degree distribution in the form of a power-law following the Barabási-Albert model. We show that there is a lower bound for the stationary prevalence (or average density of infected nodes) that triggers an unlimited increase of the expected emergent algorithmic complexity (or information) of a node as the population size grows.
Submitted 12 December, 2018;
originally announced December 2018.
-
On sequential structures in incompressible multidimensional networks
Authors:
Felipe S. Abrahão,
Klaus Wehmuth,
Hector Zenil,
Artur Ziviani
Abstract:
In order to deal with multidimensional structure representations of real-world networks, as well as with their worst-case irreducible information content analysis, the demand for new graph abstractions increases. This article investigates incompressible multidimensional networks defined by generalized graph representations. In particular, we mathematically study the lossless incompressibility of snapshot-dynamic networks and multiplex networks in comparison to the lossless incompressibility of more general forms of dynamic networks and multilayer networks, from which snapshot-dynamic networks or multiplex networks are particular cases. Our theoretical investigation first explores fundamental and basic conditions for connecting the sequential growth of information with sequential interdimensional structures such as time in dynamic networks, and secondly it presents open problems demanding future investigation. Although there may be a dissonance between sequential information dynamics and sequential topology in the general case, we demonstrate that incompressibility dissolves it, preventing both the algorithmic dynamics and the interdimensional structure of multidimensional networks from displaying a snapshot-like behavior (as characterized by any arbitrary mathematical theory). Thus, beyond methods based on statistics or probability as traditionally seen in random graphs and complex networks models, representational incompressibility implies a necessary underlying constraint in the multidimensional network topology. We argue that the study of how isomorphic transformations and their respective algorithmic information distortions can characterize sequential interdimensional structures in (multidimensional) networks helps the analysis of network topological properties while being agnostic to the chosen theory, algorithm, computation model, and programming language.
Submitted 18 October, 2024; v1 submitted 3 December, 2018;
originally announced December 2018.
-
Algorithmic information distortions and incompressibility in uniform multidimensional networks
Authors:
Felipe S. Abrahão,
Klaus Wehmuth,
Hector Zenil,
Artur Ziviani
Abstract:
This article presents a theoretical investigation of generalized encoded forms of networks in a uniform multidimensional space. First, we study encoded networks with (finite) arbitrary node dimensions (or aspects), such as time instants or layers. In particular, we study these networks that are formalized in the form of multiaspect graphs. In the context of node-aligned non-uniform (or node-unaligned non-uniform and uniform) multidimensional spaces, previous results have shown that, unlike classical graphs, the algorithmic information of a multidimensional network is not in general dominated by the algorithmic information of the binary sequence that determines the presence or absence of edges. In the present work, first we demonstrate the existence of such algorithmic information distortions for node-aligned uniform multidimensional networks. Secondly, we show that there are particular cases of infinite nesting families of finite uniform multidimensional networks such that each member of these families is incompressible. From these results, we also recover the network topological properties and equivalences in irreducible information content of multidimensional networks in comparison to their isomorphic classical graph counterpart in the previous literature. These results together establish a universal algorithmic approach and set limitations and conditions for irreducible information content analysis in comparing arbitrary networks with a large number of dimensions, such as multilayer networks.
Submitted 21 April, 2023; v1 submitted 27 October, 2018;
originally announced October 2018.
-
A survey of biodiversity informatics: Concepts, practices, and challenges
Authors:
Luiz M. R. Gadelha Jr.,
Pedro C. de Siracusa,
Artur Ziviani,
Eduardo Couto Dalcin,
Helen Michelle Affe,
Marinez Ferreira de Siqueira,
Luís Alexandre Estevão da Silva,
Douglas A. Augusto,
Eduardo Krempser,
Marcia Chame,
Raquel Lopes Costa,
Pedro Milet Meirelles,
Fabiano Thompson
Abstract:
The unprecedented size of the human population, along with its associated economic activities, has an ever-increasing impact on global environments. Across the world, countries are concerned about the growing resource consumption and the capacity of ecosystems to provide them. To effectively conserve biodiversity, it is essential to make indicators and knowledge openly available to decision-makers in ways that they can effectively use them. The development and deployment of mechanisms to produce these indicators depend on having access to trustworthy data from field surveys and automated sensors, biological collections, molecular data, and historic academic literature. The transformation of this raw data into synthesized information that is fit for use requires going through many refinement steps. The methodologies and techniques used to manage and analyze this data comprise an area often called biodiversity informatics (or e-Biodiversity). Biodiversity data follows a life cycle consisting of planning, collection, certification, description, preservation, discovery, integration, and analysis. Researchers, whether producers or consumers of biodiversity data, will likely perform activities related to at least one of these steps. This article explores each stage of the life cycle of biodiversity data, discussing its methodologies, tools, and challenges.
Submitted 7 December, 2020; v1 submitted 29 September, 2018;
originally announced October 2018.
-
Emergent Open-Endedness from Contagion of the Fittest
Authors:
Felipe S. Abrahão,
Klaus Wehmuth,
Artur Ziviani
Abstract:
In this paper, we study emergent irreducible information in populations of randomly generated computable systems that are networked and follow a "Susceptible-Infected-Susceptible" contagion model of imitation of the fittest neighbor. We show that there is a lower bound for the stationary prevalence (or average density of "infected" nodes) that triggers an unlimited increase of the expected local emergent algorithmic complexity (or information) of a node as the population size grows. We call this phenomenon expected (local) emergent open-endedness. In addition, we show that static networks with a power-law degree distribution following the Barabási-Albert model satisfy this lower bound and, thus, display expected (local) emergent open-endedness.
Submitted 21 June, 2018; v1 submitted 16 June, 2018;
originally announced June 2018.
-
A Multilayer and Time-varying Structural Analysis of the Brazilian Air Transportation Network
Authors:
Bernardo Costa,
João Victor Bechara,
Klaus Wehmuth,
Artur Ziviani
Abstract:
This paper provides a multilayer and time-varying structural analysis of one air transportation network, having the Brazilian air transportation network as a case study. Using a single mathematical object called MultiAspect Graph (MAG) for this analysis, the multilayer perspective enables the unveiling of the particular strategies of each airline to both establish and adapt in a moment of crisis its specific flight network. Similarly, the time-varying perspective allows multi-scale analysis considering different time periods, and thus assessing the impact of the economic crisis on how the different airlines establish their routes as well as the flights that use these routes. Altogether, besides the multilayer and time-varying structural analysis of the Brazilian air transportation network, this paper also acts as a proof-of-concept for the MAG potential for the modeling and analysis of high-order networks.
Submitted 19 July, 2018; v1 submitted 28 August, 2017;
originally announced September 2017.
-
Algorithmic Networks: central time to trigger expected emergent open-endedness
Authors:
Felipe S. Abrahão,
Klaus Wehmuth,
Artur Ziviani
Abstract:
This article investigates emergence and complexity in complex systems that can share information on a network. To this end, we use a theoretical approach from information theory, computability theory, and complex networks. One key studied question is how much emergent complexity (or information) arises when a population of computable systems is networked compared with when this population is isolated. First, we define a general model for networked theoretical machines, which we call algorithmic networks. Then, we narrow our scope to investigate algorithmic networks that optimize the average fitnesses of nodes in a scenario in which each node imitates the fittest neighbor and the randomly generated population is networked by a time-varying graph. We show that there are graph-topological conditions that cause these algorithmic networks to have the property of expected emergent open-endedness for large enough populations. In other words, the expected emergent algorithmic complexity of a node tends to infinity as the population size tends to infinity. Given a dynamic network, we show that these conditions imply the existence of a central time to trigger expected emergent open-endedness. Moreover, we show that networks with small diameter compared to the network size meet these conditions. We also discuss future research based on how our results are related to some problems in network science, information theory, computability theory, distributed computing, game theory, evolutionary biology, and synergy in complex systems.
Submitted 19 March, 2019; v1 submitted 30 August, 2017;
originally announced August 2017.
-
Social Events in a Time-Varying Mobile Phone Graph
Authors:
Carlos Sarraute,
Jorge Brea,
Javier Burroni,
Klaus Wehmuth,
Artur Ziviani,
J. I. Alvarez-Hamelin
Abstract:
The large-scale study of human mobility has been significantly enhanced over the last decade by the massive use of mobile phones in urban populations. Studying the activity of mobile phones allows us not only to infer social networks between individuals, but also to observe the movements of these individuals in space and time. In this work, we investigate how these two related sources of information can be integrated within the context of detecting and analyzing large social events. We show that large social events can be characterized not only by an anomalous increase in activity of the antennas in the neighborhood of the event, but also by an increase in social relationships of the attendees present at the event. Moreover, having detected a large social event via increased antenna activity, we can use the network connections to infer whether an unobserved user was present at the event. More precisely, we address the following three challenges: (i) automatically detecting large social events via increased antenna activity; (ii) characterizing the social cohesion of the detected event; and (iii) analyzing the feasibility of inferring whether unobserved users were at the event.
Submitted 19 June, 2017;
originally announced June 2017.
-
MultiAspect Graphs: Algebraic representation and algorithms
Authors:
Klaus Wehmuth,
Éric Fleury,
Artur Ziviani
Abstract:
We present the algebraic representation and basic algorithms for MultiAspect Graphs (MAGs). A MAG is a structure capable of representing multilayer and time-varying networks, as well as higher-order networks, while also having the property of being isomorphic to a directed graph. In particular, we show that, as a consequence of the properties associated with the MAG structure, a MAG can be represented in matrix form. Moreover, we also show that any possible MAG function (algorithm) can be obtained from this matrix-based representation. This is an important theoretical result since it paves the way for adapting well-known graph algorithms for application in MAGs. We present a set of basic MAG algorithms, constructed from well-known graph algorithms, such as degree computing, Breadth First Search (BFS), and Depth First Search (DFS). These algorithms adapted to the MAG context can be used as primitives for building other more sophisticated MAG algorithms. Therefore, such examples can be seen as guidelines on how to properly derive MAG algorithms from basic algorithms on directed graphs. We also make available Python implementations of all the algorithms presented in this paper.
Submitted 26 September, 2016; v1 submitted 29 April, 2015;
originally announced April 2015.
-
Time Centrality in Dynamic Complex Networks
Authors:
Eduardo Chinelate Costa,
Alex Borges Vieira,
Klaus Wehmuth,
Artur Ziviani,
Ana Paula Couto da Silva
Abstract:
There is an ever-increasing interest in investigating dynamics in time-varying graphs (TVGs). Nevertheless, so far, the notion of centrality in TVG scenarios usually refers to metrics that assess the relative importance of nodes along the temporal evolution of the dynamic complex network. For some TVG scenarios, however, more important than identifying the central nodes under a given node centrality definition is identifying the key time instants for taking certain actions. In this paper, we thus introduce and investigate the notion of time centrality in TVGs. Analogously to node centrality, time centrality evaluates the relative importance of time instants in dynamic complex networks. In this context, we present two time centrality metrics related to diffusion processes. We evaluate the two defined metrics using both a real-world dataset representing an in-person contact dynamic network and a synthetically generated randomized TVG. We validate the concept of time centrality showing that diffusion starting at the best classified time instants (i.e. the most central ones), according to our metrics, can perform a faster and more efficient diffusion process.
Submitted 5 September, 2015; v1 submitted 1 April, 2015;
originally announced April 2015.
-
On MultiAspect Graphs
Authors:
Klaus Wehmuth,
Éric Fleury,
Artur Ziviani
Abstract:
Different graph generalizations have been recently used in an ad-hoc manner to represent multilayer networks, i.e. systems formed by distinct layers where each layer can be seen as a network. Similar constructions have also been used to represent time-varying networks. We introduce the concept of MultiAspect Graph (MAG) as a graph generalization that we prove to be isomorphic to a directed graph, and also capable of representing all previous generalizations. In our proposal, the set of vertices, layers, time instants, or any other independent feature is considered an aspect of the MAG. For instance, a MAG is able to represent multilayer or time-varying networks, while both concepts can also be combined to represent a multilayer time-varying network and even other higher-order networks. Since the MAG structure admits an arbitrary (finite) number of aspects, it hence introduces a powerful modelling abstraction for networked complex systems. This paper formalizes the concept of MAG and derives theoretical results useful in the analysis of complex networked systems modelled using the proposed MAG abstraction. We also present an overview of the MAG applicability.
Submitted 14 September, 2016; v1 submitted 5 August, 2014;
originally announced August 2014.
-
A Unifying Model for Representing Time-Varying Graphs
Authors:
Klaus Wehmuth,
Artur Ziviani,
Eric Fleury
Abstract:
Graph-based models form a fundamental aspect of data representation in Data Sciences and play a key role in modeling complex networked systems. In particular, there has recently been an ever-increasing interest in modeling dynamic complex networks, i.e. networks in which the topological structure (nodes and edges) may vary over time. In this context, we propose a novel model for representing finite discrete Time-Varying Graphs (TVGs), which are typically used to model dynamic complex networked systems. We analyze the data structures built from our proposed model and demonstrate that, for most practical cases, the asymptotic memory complexity of our model is in the order of the cardinality of the set of edges. Further, we show that our proposal is a unifying model that can represent several previous (classes of) models for dynamic networks found in the recent literature, which in general are unable to represent each other. In contrast to previous models, our proposal is also able to intrinsically model cyclic (i.e. periodic) behavior in dynamic networks. These representation capabilities attest to the expressive power of our proposed unifying model for TVGs. We thus believe our unifying model for TVGs is a step forward in the theoretical foundations for data analysis of complex networked systems.
Submitted 17 September, 2015; v1 submitted 14 February, 2014;
originally announced February 2014.
-
DANCE: A Framework for the Distributed Assessment of Network Centralities
Authors:
Klaus Wehmuth,
Antonio Tadeu A. Gomes,
Artur Ziviani
Abstract:
The analysis of large-scale complex networks is a major challenge in the Big Data domain. Given the large scale of the complex networks researchers commonly deal with nowadays, the use of localized information (i.e. restricted to a limited neighborhood around each node of the network) for centrality-based analysis is gaining momentum in the recent literature. In this context, we propose a framework for the Distributed Assessment of Network Centralities (DANCE) in complex networks. DANCE offers a single environment that allows the use of different localized centrality proposals, which can be tailored to specific applications. This environment can thus be useful given the vast potential applicability of centrality-based analysis on large-scale complex networks found in different areas, such as Biology, Physics, Sociology, or Computer Science. Since the localized centrality proposals DANCE implements employ only localized information, DANCE can easily benefit from parallel processing environments and run on different computing architectures. To illustrate this, we present a parallel implementation of DANCE and show how it can be applied to the analysis of large-scale complex networks using different kinds of network centralities. This implementation is made available to complex network researchers and practitioners interested in using it through a scientific web portal.
Submitted 17 April, 2014; v1 submitted 4 August, 2011;
originally announced August 2011.
-
Distributed Algorithm to Locate Critical Nodes to Network Robustness based on Spectral Analysis
Authors:
Klaus Wehmuth,
Artur Ziviani
Abstract:
We propose an algorithm to locate the most critical nodes to network robustness. Such critical nodes may be thought of as those most related to the notion of network centrality. Our proposal relies only on a localized spectral analysis of a limited subnetwork centered at each node in the network. We also present a procedure allowing the navigation from any node towards a critical node following only local information computed by the proposed algorithm. Experimental results confirm the effectiveness of our proposal considering networks of different scales and topological characteristics.
Submitted 2 August, 2011; v1 submitted 26 January, 2011;
originally announced January 2011.
-
Capacity Planning for Vertical Search Engines
Authors:
Claudine Badue,
Jussara Almeida,
Virgilio Almeida,
Ricardo Baeza-Yates,
Berthier Ribeiro-Neto,
Artur Ziviani,
Nivio Ziviani
Abstract:
Vertical search engines focus on specific slices of content, such as the Web of a single country or the document collection of a large corporation. Despite this, like general open web search engines, they are expensive to maintain, expensive to operate, and hard to design. Because of this, predicting the response time of a vertical search engine is usually done empirically through experimentation, requiring a costly setup. An alternative is to develop a model of the search engine for predicting performance. However, this alternative is of interest only if its predictions are accurate. In this paper we propose a methodology for analyzing the performance of vertical search engines. Applying the proposed methodology, we present a capacity planning model based on a queueing network for search engines with a scale typically suitable for the needs of large corporations. The model is simple and yet reasonably accurate and, in contrast to previous work, considers the imbalance in query service times among homogeneous index servers. We discuss how we tune up the model and how we apply it to predict the impact on the query response time when parameters such as CPU and disk capacities are changed. This allows a manager of a vertical search engine to determine a priori whether a new configuration of the system might keep the query response under specified performance constraints.
Submitted 25 June, 2010;
originally announced June 2010.