-
Modeling individual attention dynamics on online social media
Authors:
Jaume Ojer,
Filippo Radicchi,
Santo Fortunato,
Michele Starnini,
Romualdo Pastor-Satorras
Abstract:
In the attention economy, understanding how individuals manage limited attention is critical. We introduce a simple model describing the decay of a user's engagement when facing multiple inputs. We analytically show that individual attention decay is determined by the overall duration of interactions, not their number or user activity. Our model is validated using data from Reddit's Change My View subreddit, where the user's attention dynamics is explicitly traceable. Despite its simplicity, our model offers a crucial microscopic perspective complementing macroscopic studies.
Submitted 2 July, 2025;
originally announced July 2025.
-
Modeling resource consumption in the US air transportation system via minimum-cost percolation
Authors:
Minsuk Kim,
C. Tyler Diggans,
Filippo Radicchi
Abstract:
We introduce a dynamic percolation model aimed at describing the consumption, and eventual exhaustion, of resources in transportation networks. In the model, rational agents progressively consume the edges of a network along demanded minimum-cost paths. As a result, the network undergoes a transition from a percolating phase, where it can properly serve demand, to a non-percolating phase, where demand can no longer be supplied. We apply the model to a weighted, directed, temporal, multi-layer network representation of the air transportation system that can be generated using real schedules of commercial flights operated by US carriers. We study how cooperation among different carriers could improve the ability of the overall air transportation system to serve passenger demand, finding that unrestricted cooperation could lead to a 30% efficiency increase compared to the non-cooperative scenario. Cooperation would require major airlines to share a significant portion of their market, but it would also allow for increased robustness of the system against perturbations causing flight cancellations. Our findings underscore some key benefits that could emerge by simply promoting code-share arrangements among US airlines without altering their current cost of operation.
Submitted 5 April, 2025;
originally announced April 2025.
-
Efficient inference of rankings from multi-body comparisons
Authors:
Jack Yeung,
Daniel Kaiser,
Filippo Radicchi
Abstract:
Many of the existing approaches to assess and predict the performance of players, teams or products in competitive contests rely on the assumption that comparisons occur between pairs of such entities. There are, however, several real contests where more than two entities are part of each comparison, e.g., sports tournaments, multiplayer board and card games, and preference surveys. The Plackett-Luce (PL) model provides a principled approach to infer the ranking of entities involved in such contests characterized by multi-body comparisons. Unfortunately, traditional algorithms used to compute PL rankings suffer from slow convergence, limiting the application of the PL model to relatively small-scale systems. We present here an alternative implementation that allows for significant speed-ups and validate its efficiency in both synthetic and real-world sets of data. Further, we perform systematic cross-validation tests concerning the ability of the PL model to predict unobserved comparisons. We find that a PL model trained on a set composed of multi-body comparisons is more predictive than a PL model trained on a set of projected pairwise comparisons derived from the very same training set, emphasizing the need to properly account for the true multi-body nature of real-world systems whenever such information is available.
Submitted 27 January, 2025;
originally announced January 2025.
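For readers unfamiliar with the Plackett-Luce model named in the abstract above, the following is a minimal sketch of its standard likelihood for a single multi-body comparison: the winner is drawn with probability proportional to its strength, then the runner-up among the remaining entities, and so on. The function name `pl_log_likelihood` and the `strength` dictionary are illustrative assumptions; this is the textbook model definition, not the accelerated inference algorithm the paper proposes.

```python
import math


def pl_log_likelihood(ranking, strength):
    """Log-probability of one observed ranking under the Plackett-Luce model.

    ranking  -- list of entity ids, ordered from first place to last place
    strength -- dict mapping entity id to a positive strength parameter
                (hypothetical container, chosen for illustration)
    """
    # Total strength of the entities still competing for the next position.
    remaining = sum(strength[item] for item in ranking)
    log_lik = 0.0
    # The last entity is placed with probability 1, so it adds nothing.
    for item in ranking[:-1]:
        log_lik += math.log(strength[item]) - math.log(remaining)
        remaining -= strength[item]
    return log_lik


if __name__ == "__main__":
    scores = {"a": 3.0, "b": 1.0, "c": 1.0}
    print(math.exp(pl_log_likelihood(["a", "b", "c"], scores)))
```

With only two entities the expression reduces to the familiar pairwise Bradley-Terry probability, which is why projecting a multi-body comparison onto pairs discards information that the full ranking retains.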
-
Shortest-path percolation on random networks
Authors:
Minsuk Kim,
Filippo Radicchi
Abstract:
We propose a bond-percolation model intended to describe the consumption, and eventual exhaustion, of resources in transport networks. Edges forming minimum-length paths connecting demanded origin-destination nodes are removed if below a certain budget. As pairs of nodes are demanded and edges are removed, the macroscopic connected component of the graph disappears, i.e., the graph undergoes a percolation transition. Here, we study such a shortest-path-percolation transition in homogeneous random graphs where pairs of demanded origin-destination nodes are randomly generated, and fully characterize it by means of finite-size scaling analysis. If budget is finite, the transition is identical to the one of ordinary percolation, where a single giant cluster shrinks as edges are removed from the graph; for infinite budget, the transition becomes more abrupt than the one of ordinary percolation, being characterized by the sudden fragmentation of the giant connected component into a multitude of clusters of similar size.
Submitted 29 July, 2024; v1 submitted 9 February, 2024;
originally announced February 2024.
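The removal rule described in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is an unweighted, undirected adjacency dictionary, path length is counted in edges, "below a certain budget" is read as "at most `budget`", and ties between equally short paths are broken by breadth-first search order. All function names are hypothetical.

```python
from collections import deque


def shortest_path(adj, source, target):
    """Return one shortest path (list of nodes) found by BFS, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in adj[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


def largest_component(adj):
    """Size of the largest connected component of the graph."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        size = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best


def shortest_path_percolation(adj, demands, budget):
    """Serve origin-destination demands in order, modifying adj in place.

    For each demanded pair, the edges of one shortest path are removed
    when the path exists and its length does not exceed the budget
    (assumed reading of the rule). Returns the largest-component size
    recorded after each demand.
    """
    sizes = []
    for origin, destination in demands:
        path = shortest_path(adj, origin, destination)
        if path is not None and len(path) - 1 <= budget:
            for u, v in zip(path, path[1:]):
                adj[u].discard(v)
                adj[v].discard(u)
        sizes.append(largest_component(adj))
    return sizes


if __name__ == "__main__":
    # Six-node ring: every node is linked to its two neighbors.
    ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
    print(shortest_path_percolation(ring, [(0, 3)], budget=float("inf")))
```

Tracking the largest-component size after every demand gives the order parameter whose collapse marks the percolation transition; passing `float("inf")` as the budget corresponds to the infinite-budget case mentioned in the abstract.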
-
Symmetry breaking in optimal transport networks
Authors:
Siddharth Patwardhan,
Marc Barthelemy,
Sirag Erkol,
Santo Fortunato,
Filippo Radicchi
Abstract:
Despite its importance for practical applications, not much is known about the optimal shape of a network that connects in an efficient way a set of points. This problem can be formulated in terms of a multiplex network with a fast layer embedded in a slow one. To connect a pair of points, one can then use either the fast or slow layer, or both, with a switching cost when going from one layer to the other. We consider here distributions of points in spaces of arbitrary dimension d and search for the fast-layer network of given size that minimizes the average time to reach a central node. We discuss the d = 1 case analytically and the d > 1 case numerically, and show the existence of transitions when we vary the network size, the switching cost and/or the relative speed of the two layers. Surprisingly, there is a transition characterized by a symmetry breaking indicating that it is sometimes better to avoid serving a whole area in order to save on switching costs, at the expense of using more the slow layer. Our findings underscore the importance of considering switching costs while studying optimal network structures, as small variations of the cost can lead to strikingly dissimilar results. Finally, we discuss real-world subways and their efficiency for the cities of Atlanta, Boston, and Toronto. We find that real subways are farther away from the optimal shapes as traffic congestion increases.
Submitted 8 November, 2023;
originally announced November 2023.
-
Reconstruction of multiplex networks via graph embeddings
Authors:
Daniel Kaiser,
Siddharth Patwardhan,
Minsuk Kim,
Filippo Radicchi
Abstract:
Multiplex networks are collections of networks with identical nodes but distinct layers of edges. They are genuine representations for a large variety of real systems whose elements interact in multiple fashions or flavors. However, multiplex networks are not always simple to observe in the real world; often, only partial information on the layer structure of the networks is available, whereas the remaining information is in the form of aggregated, single-layer networks. Recent works have proposed solutions to the problem of reconstructing the hidden multiplexity of single-layer networks using tools proper of network science. Here, we develop a machine learning framework that takes advantage of graph embeddings, i.e., representations of networks in geometric space. We validate the framework in systematic experiments aimed at the reconstruction of synthetic and real-world multiplex networks, providing evidence that our proposed framework not only accomplishes its intended task, but often outperforms existing reconstruction techniques.
Submitted 26 February, 2024; v1 submitted 19 October, 2023;
originally announced October 2023.
-
Network community detection via neural embeddings
Authors:
Sadamori Kojaku,
Filippo Radicchi,
Yong-Yeol Ahn,
Santo Fortunato
Abstract:
Recent advances in machine learning research have produced powerful neural graph embedding methods, which learn useful, low-dimensional vector representations of network data. These neural methods for graph embedding excel in graph machine learning tasks and are now widely adopted. However, how and why these methods work -- particularly how network structure gets encoded in the embedding -- remain largely unexplained. Here, we show that node2vec -- a shallow, linear neural network -- encodes communities into separable clusters better than random partitioning down to the information-theoretic detectability limit for the stochastic block model. We show that this is due to the equivalence between the embedding learned by node2vec and the spectral embedding via the eigenvectors of the symmetric normalized Laplacian matrix. Numerical simulations demonstrate that node2vec is capable of learning communities on sparse graphs generated by the stochastic block model, as well as on sparse degree-heterogeneous networks. Our results highlight the features of graph neural networks that enable them to separate communities in embedding space.
Submitted 1 November, 2024; v1 submitted 23 June, 2023;
originally announced June 2023.
-
Epidemic spreading in group-structured populations
Authors:
Siddharth Patwardhan,
Varun K. Rao,
Santo Fortunato,
Filippo Radicchi
Abstract:
Individuals involved in common group activities/settings -- e.g., college students who are enrolled in the same class and/or live in the same dorm -- are exposed to recurrent contacts of physical proximity. These contacts are known to mediate the spread of an infectious disease; however, it is not obvious how the properties of the spreading process are determined by the structure of and the interrelation among the group settings that are at the root of those recurrent interactions. Here, we show that reshaping the organization of groups within a population can be used as an effective strategy to decrease the severity of an epidemic. Specifically, we show that when group structures are sufficiently correlated -- e.g., the likelihood for two students living in the same dorm to attend the same class is sufficiently high -- outbreaks are longer but milder than for uncorrelated group structures. Also, we show that the effectiveness of interventions for disease containment increases as the correlation among group structures increases. We demonstrate the practical relevance of our findings by taking advantage of data about housing and attendance of students at the Indiana University campus in Bloomington. By appropriately optimizing the assignment of students to dorms based on their enrollment, we are able to observe a two- to five-fold reduction in the severity of simulated epidemic processes.
Submitted 21 October, 2024; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Heterogeneous message passing for heterogeneous networks
Authors:
George T. Cantwell,
Alec Kirkley,
Filippo Radicchi
Abstract:
Message passing (MP) is a computational technique used to find approximate solutions to a variety of problems defined on networks. MP approximations are generally accurate in locally tree-like networks but require corrections to maintain their accuracy level in networks rich with short cycles. However, MP may already be computationally challenging on very large networks and additional costs incurred by correcting for cycles could be prohibitive. We show how the issue can be addressed. By allowing each node in the network to have its own level of approximation, one can focus on improving the accuracy of MP approaches in a targeted manner. We perform a systematic analysis of 109 real-world networks and show that our node-based MP approximation is able to increase both the accuracy and speed of traditional MP approaches. We find that, compared to conventional MP, a heterogeneous approach based on a simple heuristic is more accurate in 81% of tested networks, faster in 64% of cases, and both more accurate and faster in 49% of cases.
Submitted 26 September, 2023; v1 submitted 3 May, 2023;
originally announced May 2023.
-
Dynamical Methods for Target Control of Biological Networks
Authors:
Thomas Parmer,
Filippo Radicchi
Abstract:
Estimating the influence that individual nodes have on one another in a Boolean network is essential to predict and control the system's dynamical behavior, for example, detecting key therapeutic targets to control pathways in models of biological signaling and regulation. Exact estimation is generally not possible because the number of configurations that must be considered grows exponentially with the system size. However, approximate, scalable methods exist in the literature. These methods can be divided into two main classes: (i) graph-theoretic methods that rely on representations of Boolean dynamics as static graphs, and (ii) mean-field approaches that describe average trajectories of the system but neglect dynamical correlations. Here, we systematically compare the performance of these state-of-the-art methods on a large collection of real-world gene regulatory networks. We find comparable performance across methods. All methods underestimate the ground truth, with mean-field approaches having a better recall but a worse precision than graph-theoretic methods. Computationally speaking, graph-theoretic methods are faster than mean-field ones in sparse networks, but are slower in dense networks. The choice of which method to use, therefore, depends on a network's connectivity and the relative importance of recall vs. precision for the specific application at hand.
Submitted 1 November, 2023; v1 submitted 20 April, 2023;
originally announced April 2023.
-
Critical avalanches of Susceptible-Infected-Susceptible dynamics in finite networks
Authors:
Daniele Notarmuzi,
Alessandro Flammini,
Claudio Castellano,
Filippo Radicchi
Abstract:
We investigate the avalanche temporal statistics of the Susceptible-Infected-Susceptible (SIS) model when the dynamics is critical and takes place on finite random networks. By considering numerical simulations on annealed topologies we show that the survival probability always exhibits three distinct dynamical regimes. Size-dependent crossover timescales separating them scale differently for homogeneous and for heterogeneous networks. The phenomenology can be qualitatively understood based on known features of the SIS dynamics on networks. A fully quantitative approach based on Langevin theory is shown to perfectly reproduce the results for homogeneous networks, while failing in the heterogeneous case. The analysis is extended to quenched random networks, which behave in agreement with the annealed case for strongly homogeneous and strongly heterogeneous networks.
Submitted 12 January, 2023;
originally announced January 2023.
-
Consistency pays off in science
Authors:
Sirag Erkol,
Satyaki Sikdar,
Filippo Radicchi,
Santo Fortunato
Abstract:
The exponentially growing number of scientific papers stimulates a discussion on the interplay between quantity and quality in science. In particular, one may wonder which publication strategy may offer more chances of success: publishing lots of papers, producing a few hit papers, or something in between. Here we tackle this question by studying the scientific portfolios of Nobel Prize laureates. A comparative analysis of different citation-based indicators of individual impact suggests that the best path to success may rely on consistently producing high-quality work. Such a pattern is especially rewarded by a new metric, the $E$-index, which identifies excellence better than state-of-the-art measures.
Submitted 11 May, 2023; v1 submitted 16 October, 2022;
originally announced October 2022.
-
Influence Maximization: Divide and Conquer
Authors:
Siddharth Patwardhan,
Filippo Radicchi,
Santo Fortunato
Abstract:
The problem of influence maximization, i.e., finding the set of nodes having maximal influence on a network, is of great importance for several applications. In the past two decades, many heuristic metrics to spot influencers have been proposed. Here, we introduce a framework to boost the performance of any such metric. The framework consists in dividing the network into sectors of influence, and then selecting the most influential nodes within these sectors. We explore three different methodologies to find sectors in a network: graph partitioning, graph hyperbolic embedding, and community structure. The framework is validated with a systematic analysis of real and synthetic networks. We show that the gain in performance generated by dividing a network into sectors before selecting the influential spreaders increases as the modularity and heterogeneity of the network increase. Also, we show that the division of the network into sectors can be efficiently performed in a time that scales linearly with the network size, thus making the framework applicable to large-scale influence maximization problems.
Submitted 6 October, 2022; v1 submitted 3 October, 2022;
originally announced October 2022.
-
Embedding-aided network dismantling
Authors:
Saeed Osat,
Fragkiskos Papadopoulos,
Andreia Sofia Teixeira,
Filippo Radicchi
Abstract:
Optimal percolation concerns the identification of the minimum-cost strategy for the destruction of any extensive connected components in a network. Solutions of such a dismantling problem are important for the design of optimal strategies of disease containment based either on immunization or social distancing. Depending on the specific variant of the problem considered, network dismantling is performed via the removal of nodes or edges, and different cost functions are associated with the removal of these microscopic elements. In this paper, we show that network representations in geometric space can be used to solve several variants of the network dismantling problem in a coherent fashion. Once a network is embedded, dismantling is implemented using intuitive geometric strategies. We demonstrate that the approach is well suited to both Euclidean and hyperbolic network embeddings. Our systematic analysis on synthetic and real networks demonstrates that the performance of embedding-aided techniques is comparable to, if not better than, that of the best dismantling algorithms currently available.
Submitted 1 August, 2022;
originally announced August 2022.
-
Multiplex reconstruction with partial information
Authors:
Daniel Kaiser,
Siddharth Patwardhan,
Filippo Radicchi
Abstract:
A multiplex is a collection of network layers, each representing a specific type of edges. This appears to be a genuine representation for many real-world systems. However, due to a variety of potential factors, such as limited budget and equipment, or physical impossibility, multiplex data can be difficult to observe directly. Often, only partial information on the layer structure of the system is available, whereas the remaining information is in the form of a single-layer network. In this work, we face the problem of reconstructing the hidden multiplex structure of an aggregated network from partial information. We propose an algorithm that leverages the layer-wise community structure that can be learned from partial observations to reconstruct the ground-truth topology of the unobserved part of the multiplex. The algorithm is characterized by a computational time that grows linearly with the network size. We perform a systematic study of reconstruction problems for both synthetic and real-world multiplex networks. We show that the ability of the proposed method to solve the reconstruction problem is affected by the heterogeneity of the individual layers and the similarity among the layers. On real-world networks, we observe that the accuracy of the reconstruction saturates quickly as the amount of available information increases. In genetic interaction and scientific collaboration multiplexes for example, we find that 10% of ground-truth information yields 70% accuracy, while 30% information allows for more than 90% accuracy.
Submitted 21 February, 2023; v1 submitted 15 June, 2022;
originally announced June 2022.
-
Effective submodularity of influence maximization on temporal networks
Authors:
Sirag Erkol,
Dario Mazzilli,
Filippo Radicchi
Abstract:
We study influence maximization on temporal networks. This is a special setting where the influence function is not submodular, and there is no optimality guarantee for solutions achieved via greedy optimization. We perform an exhaustive analysis on both real and synthetic networks. We show that the influence function of randomly sampled sets of seeds often violates the necessary conditions for submodularity. However, when sets of seeds are selected according to the greedy optimization strategy, the influence function behaves effectively as a submodular function. Specifically, violations of the necessary conditions for submodularity are never observed in real networks, and only rarely in synthetic ones. The direct comparison with exact solutions obtained via brute-force search indicates that the greedy strategy provides approximate solutions that are well within the optimality gap guaranteed for strictly submodular functions. Greedy optimization therefore appears to be an effective strategy for the maximization of influence on temporal networks.
Submitted 2 September, 2022; v1 submitted 11 May, 2022;
originally announced May 2022.
-
The dynamic nature of percolation on networks with triadic interactions
Authors:
Hanlin Sun,
Filippo Radicchi,
Jürgen Kurths,
Ginestra Bianconi
Abstract:
Percolation establishes the connectivity of complex networks and is one of the most fundamental critical phenomena for the study of complex systems. On simple networks, percolation displays a second-order phase transition; on multiplex networks, the percolation transition can become discontinuous. However, little is known about percolation in networks with higher-order interactions. Here, we show that percolation can be turned into a fully-fledged dynamical process when higher-order interactions are taken into account. By introducing signed triadic interactions, in which a node can regulate the interactions between two other nodes, we define triadic percolation. We uncover that in this paradigmatic model the connectivity of the network changes in time and that the order parameter undergoes a period-doubling and a route to chaos. We provide a general theory for triadic percolation which accurately predicts the full phase diagram on random graphs as confirmed by extensive numerical simulations. We find that triadic percolation on real network topologies reveals a similar phenomenology. These results radically change our understanding of percolation and may be used to study complex systems in which the functional connectivity is changing in time dynamically and in a non-trivial way, such as in neural and climate networks.
Submitted 11 March, 2023; v1 submitted 23 April, 2022;
originally announced April 2022.
-
Universality, criticality and complexity of information propagation in social media
Authors:
Daniele Notarmuzi,
Claudio Castellano,
Alessandro Flammini,
Dario Mazzilli,
Filippo Radicchi
Abstract:
Information avalanches in social media are typically studied in a similar fashion as avalanches of neuronal activity in the brain. Whereas a large body of literature reveals substantial agreement about the existence of a unique process characterizing neuronal activity across organisms, the dynamics of information in online social media is far less understood. Statistical laws of information avalanches are found in previous studies to be not robust across systems, and radically different processes are used to represent plausible driving mechanisms for information propagation. Here, we analyze almost 1 billion time-stamped events collected from a multitude of online platforms -- including Telegram, Twitter and Weibo -- over observation windows longer than 10 years to show that the propagation of information in social media is a universal and critical process. Universality arises from the observation of identical macroscopic patterns across platforms, irrespective of the details of the specific system at hand. Critical behavior is deduced from the power-law distributions, and corresponding hyperscaling relations, characterizing size and duration of avalanches of information. Neuronal activity may be modeled as a simple contagion process, where only a single exposure to activity may be sufficient for its diffusion. On the contrary, statistical testing on our data indicates that a mixture of simple and complex contagion, where involvement of an individual requires exposure from multiple acquaintances, characterizes the propagation of information in social media. We show that the complexity of the process is correlated with the semantic content of the information that is propagated. Conversational topics about music, movies and TV shows tend to propagate as simple contagion processes, whereas controversial discussions on political/societal themes obey the rules of complex contagion.
Submitted 6 October, 2021; v1 submitted 31 August, 2021;
originally announced September 2021.
-
Systematic comparison of graph embedding methods in practical tasks
Authors:
Yi-Jiao Zhang,
Kai-Cheng Yang,
Filippo Radicchi
Abstract:
Network embedding techniques aim at representing structural properties of graphs in geometric space. Those representations are considered useful in downstream tasks such as link prediction and clustering. However, the number of graph embedding methods available on the market is large, and practitioners face the non-trivial choice of selecting the proper approach for a given application. The present work attempts to close this gap of knowledge through a systematic comparison of eleven different methods for graph embedding. We consider methods for embedding networks in the hyperbolic and Euclidean metric spaces, as well as non-metric community-based embedding methods. We apply these methods to embed more than one hundred real-world and synthetic networks. Three common downstream tasks -- mapping accuracy, greedy routing, and link prediction -- are considered to evaluate the quality of the various embedding methods. Our results show that some Euclidean embedding methods excel in greedy routing. As for link prediction, community-based and hyperbolic embedding methods yield overall performance superior to that of Euclidean-space-based approaches. We compare the running time for different methods and further analyze the impact of different network characteristics such as degree distribution, modularity, and clustering coefficients on the quality of the different embedding methods. We release our evaluation framework to provide a standardized benchmark for arbitrary embedding methods.
Submitted 18 June, 2021;
originally announced June 2021.
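Illustrative note on greedy routing, one of the downstream tasks named in the abstract above. The following is a minimal sketch, not the paper's evaluation framework: it assumes a Euclidean embedding, and it assumes the variant of greedy routing in which a packet is forwarded to the neighbor closest to the target and delivery fails as soon as no neighbor is strictly closer than the current node. The function name `greedy_route` and the toy graph are hypothetical.

```python
import math
from typing import Dict, Hashable, List, Optional, Sequence

Node = Hashable


def greedy_route(
    adj: Dict[Node, List[Node]],
    coords: Dict[Node, Sequence[float]],
    source: Node,
    target: Node,
) -> Optional[List[Node]]:
    """Return the greedy path from source to target, or None if routing gets stuck.

    Assumption: distances are Euclidean distances between embedding coordinates.
    """
    path = [source]
    current = source
    while current != target:
        here = math.dist(coords[current], coords[target])
        # Pick the neighbor whose embedded position is closest to the target.
        best = min(
            adj[current],
            key=lambda v: math.dist(coords[v], coords[target]),
            default=None,
        )
        # Fail when there is no neighbor, or none that makes strict progress.
        if best is None or math.dist(coords[best], coords[target]) >= here:
            return None
        path.append(best)
        current = best
    return path


# Toy path graph a-b-c-d (hypothetical data).
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
good_coords = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0), "d": (3.0, 0.0)}
# Same graph, but b is embedded on the wrong side of a.
bad_coords = {"a": (0.0, 0.0), "b": (-1.0, 0.0), "c": (2.0, 0.0), "d": (3.0, 0.0)}

delivered = greedy_route(adj, good_coords, "a", "d")
stuck = greedy_route(adj, bad_coords, "a", "d")
```

Because every hop must strictly reduce the distance to the target, the walk cannot revisit a node, so no explicit loop detection is needed in this variant. The fraction of source-target pairs for which a path is returned would be one possible success score.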
-
Percolation theory of self-exciting temporal processes
Authors:
Daniele Notarmuzi,
Claudio Castellano,
Alessandro Flammini,
Dario Mazzilli,
Filippo Radicchi
Abstract:
We investigate how the properties of inhomogeneous patterns of activity, appearing in many natural and social phenomena, depend on the temporal resolution used to define individual bursts of activity. To this end, we consider time series of microscopic events produced by a self-exciting Hawkes process, and leverage a percolation framework to study the formation of macroscopic bursts of activity as a function of the resolution parameter. We find that the very same process may result in different distributions of avalanche size and duration, which are understood in terms of the competition between the 1D percolation and the branching process universality class. Pure regimes for the individual classes are observed at specific values of the resolution parameter corresponding to the critical points of the percolation diagram. A regime of crossover characterized by a mixture of the two universal behaviors is observed in a wide region of the diagram. The hybrid scaling appears to be a likely outcome for an analysis of the time series based on a reasonably chosen, but not precisely adjusted, value of the resolution parameter.
Submitted 24 February, 2021; v1 submitted 4 February, 2021;
originally announced February 2021.
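Illustrative note on how a temporal resolution can define bursts of activity, as discussed in the abstract above. This is a minimal sketch under a stated assumption, not the authors' code: two consecutive events are placed in the same burst when their inter-event time does not exceed a resolution parameter, here called `delta` (a hypothetical name). Burst size is the number of events and burst duration is the time between the first and last event.

```python
from typing import List, Tuple


def bursts(times: List[float], delta: float) -> List[Tuple[int, float]]:
    """Group event times into bursts and return (size, duration) for each burst.

    Assumption: consecutive events separated by at most `delta` share a burst.
    """
    if not times:
        return []
    ordered = sorted(times)
    result = []
    start = prev = ordered[0]
    size = 1
    for t in ordered[1:]:
        if t - prev <= delta:
            size += 1  # same burst continues
        else:
            result.append((size, prev - start))  # close the current burst
            start = t
            size = 1
        prev = t
    result.append((size, prev - start))  # close the final burst
    return result


# Hypothetical event times; the same series read at two resolutions.
events = [0.0, 0.5, 1.0, 5.0, 5.2, 20.0]
fine = bursts(events, 0.6)
coarse = bursts(events, 10.0)
```

The toy series shows the dependence on resolution in miniature: the fine resolution yields three bursts, while the coarse one merges the first five events into a single burst.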
-
Combinatorial approach to spreading processes on networks
Authors:
Dario Mazzilli,
Filippo Radicchi
Abstract:
Stochastic spreading models defined on complex network topologies are used to mimic the diffusion of diseases, information, and opinions in real-world systems. Existing theoretical approaches to the characterization of the models in terms of microscopic configurations rely on some approximation of independence among dynamical variables, thus introducing a systematic bias in the prediction of the ground-truth dynamics. Here, we develop a combinatorial framework based on the approximation that spreading may occur only along the shortest paths connecting pairs of nodes. The approximation overestimates dynamical correlations among node states and leads to biased predictions. This systematic bias, however, points in the direction opposite to that of existing approximations. We show that the combination of the two biased approaches generates predictions of the ground-truth dynamics that are more accurate than the ones given by the two approximations if used in isolation. We further take advantage of the combinatorial approximation to characterize theoretical properties of some inference problems, and show that the reconstruction of microscopic configurations is very sensitive to both the place where and the time when partial knowledge of the system is acquired.
Submitted 6 January, 2021;
originally announced January 2021.
-
Who is the best coach of all time? A network-based assessment of the career performance of professional sports coaches
Authors:
Şirag Erkol,
Filippo Radicchi
Abstract:
We consider two large datasets consisting of all games played among top-tier European soccer clubs in the last 60 years, and among professional American basketball teams in the past 70 years. We leverage game data to build networks of pairwise interactions between the head coaches of the teams, and measure their career performance in terms of network centrality metrics. We identify Arsene Wenger, Sir Alex Ferguson, Jupp Heynckes, Carlo Ancelotti, and Jose Mourinho as the top 5 European soccer coaches of all time. In American basketball, the first 5 positions of the all-time ranking are occupied by Red Auerbach, Gregg Popovich, Phil Jackson, Don Nelson, and Lenny Wilkens. We further establish rankings by decade and season. We develop a simple methodology to monitor performance throughout a coach's career, and to dynamically compare the performance of two or more coaches at a given time. The manuscript is accompanied by the website coachscore.luddy.indiana.edu where complete results of our analysis are accessible to the interested readers.
Submitted 26 April, 2021; v1 submitted 16 December, 2020;
originally announced December 2020.
-
Detecting climate teleconnections with Granger causality
Authors:
Filipi N Silva,
Didier A. Vega-Oliveros,
Xiaoran Yan,
Alessandro Flammini,
Filippo Menczer,
Filippo Radicchi,
Ben Kravitz,
Santo Fortunato
Abstract:
Climate system teleconnections are crucial for improving climate predictability, but difficult to quantify. Standard approaches to identify teleconnections are often based on correlations between time series. Here we present a novel method leveraging Granger causality, which can infer/detect relationships between any two fields. We compare teleconnections identified by correlation and Granger causality at different timescales. We find that both Granger causality and correlation consistently recover known seasonal precipitation responses to the sea surface temperature pattern associated with the El Niño Southern Oscillation. Such findings are robust across multiple time resolutions. In addition, we identify candidates for unexplored teleconnection responses.
Submitted 28 September, 2021; v1 submitted 16 November, 2020;
originally announced December 2020.
-
Model-free hidden geometry of complex networks
Authors:
Yi-Jiao Zhang,
Kai-Cheng Yang,
Filippo Radicchi
Abstract:
The fundamental idea of embedding a network in a metric space is rooted in the principle of proximity preservation. Nodes are mapped into points of the space with pairwise distance that reflects their proximity in the network. Popular methods employed in network embedding either rely on implicit approximations of the principle of proximity preservation or implement it by enforcing the geometry of the embedding space, thus hindering geometric properties that networks may spontaneously exhibit. Here, we take advantage of a model-free embedding method explicitly devised for preserving pairwise proximity, and characterize the geometry emerging from the mapping of several networks, both real and synthetic. We show that the learned embedding has simple and intuitive interpretations: the distance of a node from the geometric center is representative of its closeness centrality, and the relative positions of nodes reflect the community structure of the network. Proximity can be preserved in relatively low-dimensional embedding spaces, and the hidden geometry displays optimal performance in guiding greedy navigation regardless of the specific network topology. We finally show that the mapping provides a natural description of contagion processes on networks, with complex spatiotemporal patterns represented by waves propagating from the geometric center to the periphery. The findings deepen our understanding of the model-free hidden geometry of complex networks.
Submitted 14 January, 2021; v1 submitted 16 November, 2020;
originally announced November 2020.
-
Influence maximization on temporal networks
Authors:
Sirag Erkol,
Dario Mazzilli,
Filippo Radicchi
Abstract:
We consider the optimization problem of seeding a spreading process on a temporal network so that the expected size of the resulting outbreak is maximized. We frame the problem for a spreading process following the rules of the susceptible-infected-recovered model with temporal scale equal to the one characterizing the evolution of the network topology. We perform a systematic analysis based on a corpus of 12 real-world temporal networks and quantify the performance of solutions to the influence maximization problem obtained using different levels of information about network topology and dynamics. We find that having perfect knowledge of the network topology but in a static and/or aggregated form is not helpful in solving the influence maximization problem effectively. Knowledge, even if partial, of the early stages of the network dynamics appears instead essential for the identification of quasioptimal sets of influential spreaders.
Submitted 19 October, 2020;
originally announced October 2020.
-
Community detection in networks using graph embeddings
Authors:
Aditya Tandon,
Aiiad Albeshri,
Vijey Thayananthan,
Wadee Alhalabi,
Filippo Radicchi,
Santo Fortunato
Abstract:
Graph embedding methods are becoming increasingly popular in the machine learning community, where they are widely used for tasks such as node classification and link prediction. Embedding graphs in geometric spaces should aid the identification of network communities as well, because nodes in the same community should be projected close to each other in the geometric space, where they can be detected via standard data clustering algorithms. In this paper, we test the ability of several graph embedding techniques to detect communities on benchmark graphs. We compare their performance against that of traditional community detection algorithms. We find that the performance is comparable, if the parameters of the embedding techniques are suitably chosen. However, the optimal parameter set varies with the specific features of the benchmark graphs, like their size, whereas popular community detection algorithms do not require any parameter. So it is not possible to indicate beforehand good parameter sets for the analysis of real networks. This finding, along with the high computational cost of embedding a network and grouping the points, suggests that, for community detection, current embedding techniques do not represent an improvement over network clustering algorithms.
Submitted 5 March, 2021; v1 submitted 11 September, 2020;
originally announced September 2020.
-
Epidemic plateau in critical SIR dynamics with non-trivial initial conditions
Authors:
Filippo Radicchi,
Ginestra Bianconi
Abstract:
Containment measures implemented by some countries to suppress the spread of COVID-19 have resulted in a slowdown of the epidemic characterized by time series of daily infections plateauing over extended periods of time. We prove that such a dynamical pattern is compatible with critical Susceptible-Infected-Removed (SIR) dynamics. In traditional analyses of the critical SIR model, the critical dynamical regime is started from a single infected node. The application of containment measures to an ongoing epidemic, however, has the effect of making the system enter its critical regime with a potentially large number of infected individuals. We describe how such non-trivial starting conditions affect the critical behavior of the SIR model. We perform a theoretical and large-scale numerical investigation of the model. We show that the expected outbreak size is an increasing function of the initial number of infected individuals, while the expected duration of the outbreak is a non-monotonic function of the initial number of infected individuals. Also, we precisely characterize the magnitude of the fluctuations associated with the size and duration of the outbreak in critical SIR dynamics with non-trivial initial conditions. Far from herd immunity, fluctuations are much larger than average values, thus indicating that predictions of plateauing time series may be particularly challenging.
Submitted 21 October, 2020; v1 submitted 29 July, 2020;
originally announced July 2020.
-
Principled approach to the selection of the embedding dimension of networks
Authors:
Weiwei Gu,
Aditya Tandon,
Yong-Yeol Ahn,
Filippo Radicchi
Abstract:
Network embedding is a general-purpose machine learning technique that encodes network structure in vector spaces with tunable dimension. Choosing an appropriate embedding dimension -- small enough to be efficient and large enough to be effective -- is challenging but necessary to generate embeddings applicable to a multitude of tasks. Existing strategies for the selection of the embedding dimension rely on performance maximization in downstream tasks. Here, we propose a principled method such that all structural information of a network is parsimoniously encoded. The method is validated on various embedding algorithms and a large corpus of real-world networks. The embedding dimension selected by our method in real-world networks suggests that efficient encoding in low-dimensional spaces is usually possible.
Submitted 18 June, 2021; v1 submitted 21 April, 2020;
originally announced April 2020.
-
Classes of critical avalanche dynamics in complex networks
Authors:
Filippo Radicchi,
Claudio Castellano,
Alessandro Flammini,
Miguel A. Muñoz,
Daniele Notarmuzi
Abstract:
Dynamical processes exhibiting absorbing states are essential in the modeling of a large variety of situations from material science to epidemiology and social sciences. Such processes exhibit the possibility of avalanching behavior upon slow driving. Here, we study the distribution of sizes and durations of avalanches for well-known dynamical processes on complex networks. We find that all analyzed models display a similar critical behavior, characterized by the presence of two distinct regimes. At small scales, sizes and durations of avalanches exhibit distributions that are dependent on the network topology and the model dynamics. At asymptotically large scales instead -- irrespective of the type of dynamics and of the topology of the underlying network -- sizes and durations of avalanches are characterized by power-law distributions with the exponents of the standard mean-field critical branching process.
Submitted 31 July, 2020; v1 submitted 26 February, 2020;
originally announced February 2020.
-
k-core structure of real multiplex networks
Authors:
Saeed Osat,
Filippo Radicchi,
Fragkiskos Papadopoulos
Abstract:
Multiplex networks are convenient mathematical representations for many real-world -- biological, social, and technological -- systems of interacting elements, where pairwise interactions among elements have different flavors. Previous studies pointed out that real-world multiplex networks display significant inter-layer correlations -- degree-degree correlation, edge overlap, node similarities -- able to make them robust against random and targeted failures of their individual components. Here, we show that inter-layer correlations are important also in the characterization of their $\mathbf{k}$-core structure, namely the organization in shells of nodes with increasingly high degree. Understanding $k$-core structures is important in the study of spreading processes taking place on networks, as for example in the identification of influential spreaders and the emergence of localization phenomena. We find that, if the degree distribution of the network is heterogeneous, then a strong $\mathbf{k}$-core structure is well predicted by significantly positive degree-degree correlations. However, if the network degree distribution is homogeneous, then strong $\mathbf{k}$-core structure is due to positive correlations at the level of node similarities. We reach our conclusions by analyzing different real-world multiplex networks, introducing novel techniques for controlling inter-layer correlations of networks without changing their structure, and taking advantage of synthetic network models with tunable levels of inter-layer correlations.
Submitted 24 May, 2020; v1 submitted 25 November, 2019;
originally announced November 2019.
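Illustrative note on the $\mathbf{k}$-core of a multiplex network mentioned in the abstract above. This is a minimal sketch, not the authors' implementation. It assumes the commonly used definition in which, for a vector $\mathbf{k} = (k_1, \dots, k_M)$, the $\mathbf{k}$-core is the largest set of nodes such that every node keeps at least $k_\ell$ neighbors inside the set in each layer $\ell$. The function name `multiplex_k_core` and the toy layers are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Set

Node = Hashable


def multiplex_k_core(
    layers: List[Dict[Node, Set[Node]]], k: Sequence[int]
) -> Set[Node]:
    """Return the nodes surviving iterative pruning with per-layer thresholds k.

    Assumption: a node is pruned when, in any layer l, it has fewer than k[l]
    neighbors among the nodes that are still alive.
    """
    alive: Set[Node] = set().union(*(layer.keys() for layer in layers))
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            for adj, k_min in zip(layers, k):
                degree = sum(1 for v in adj.get(node, ()) if v in alive)
                if degree < k_min:
                    alive.discard(node)  # pruning may trigger further removals
                    changed = True
                    break
    return alive


# Two hypothetical layers on nodes 1..4.
layer_a = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
layer_b = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2}, 4: {1, 2}}

core_22 = multiplex_k_core([layer_a, layer_b], (2, 2))
core_12 = multiplex_k_core([layer_a, layer_b], (1, 2))
core_31 = multiplex_k_core([layer_a, layer_b], (3, 1))
```

In the toy example, node 4 has a single neighbor in the first layer, so it survives the (1, 2)-core but not the (2, 2)-core; requiring three neighbors in the first layer empties the core through a cascade of removals.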
-
Classical Information Theory of Networks
Authors:
Filippo Radicchi,
Dmitri Krioukov,
Harrison Hartle,
Ginestra Bianconi
Abstract:
Existing information-theoretic frameworks based on maximum entropy network ensembles are not able to explain the emergence of heterogeneity in complex networks. Here, we fill this gap of knowledge by developing a classical framework for networks based on finding an optimal trade-off between the information content of a compressed representation of the ensemble and the information content of the actual network ensemble. In this way, we not only introduce a novel classical network ensemble satisfying a set of soft constraints, but are also able to calculate the optimal distribution of the constraints. We show that for the classical network ensemble in which the only constraints are the expected degrees a power-law degree distribution is optimal. Also, we study spatially embedded networks finding that the interactions between nodes naturally lead to non-uniform spread of nodes in the space, with pairs of nodes at a given distance not necessarily obeying a power-law distribution. The pertinent features of real-world air transportation networks are well described by the proposed framework.
Submitted 14 May, 2020; v1 submitted 10 August, 2019;
originally announced August 2019.
-
Systematic comparison between methods for the detection of influential spreaders in complex networks
Authors:
Sirag Erkol,
Claudio Castellano,
Filippo Radicchi
Abstract:
Influence maximization is the problem of finding the set of nodes of a network that maximizes the size of the outbreak of a spreading process occurring on the network. Solutions to this problem are important for strategic decisions in marketing and political campaigns. The typical setting consists in the identification of small sets of initial spreaders in very large networks. This setting makes the optimization problem computationally infeasible for standard greedy optimization algorithms that account simultaneously for information about network topology and spreading dynamics, leaving space only to heuristic methods based on the drastic approximation of relying on the geometry of the network alone. The literature on the subject abounds with purely topological methods for the identification of influential spreaders in networks. However, it is unclear how far these methods are from being optimal. Here, we perform a systematic test of the performance of a multitude of heuristic methods for the identification of influential spreaders. We quantify the performance of the various methods on a corpus of 100 real-world networks; the corpus consists of networks small enough for the application of greedy optimization so that results from this algorithm are used as the baseline needed for the analysis of the performance of the other methods on the same corpus of networks. We find that relatively simple network metrics, such as adaptive degree or closeness centralities, are able to achieve performances very close to the baseline value, thus providing good support for the use of these metrics in large-scale problem settings....
Submitted 22 October, 2019; v1 submitted 17 April, 2019;
originally announced April 2019.
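Illustrative note on the adaptive degree heuristic named in the abstract above. This is a minimal sketch under a stated assumption, not the paper's code: "adaptive" is taken to mean that, after each seed is chosen, the seed is removed from the network and the degrees of the remaining nodes are recomputed before the next choice. The function name `adaptive_degree_seeds` and the toy network are hypothetical; ties are broken by insertion order.

```python
from typing import Dict, Hashable, List, Set

Node = Hashable


def adaptive_degree_seeds(adj: Dict[Node, Set[Node]], n_seeds: int) -> List[Node]:
    """Pick seeds one at a time, always taking the highest residual degree.

    Assumption: an undirected network given as a symmetric adjacency mapping.
    """
    remaining = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    seeds: List[Node] = []
    for _ in range(min(n_seeds, len(remaining))):
        best = max(remaining, key=lambda u: len(remaining[u]))
        seeds.append(best)
        # Remove the chosen node so that its edges no longer count.
        for v in remaining[best]:
            remaining[v].discard(best)
        del remaining[best]
    return seeds


# Hypothetical network: hub 0 with leaves 1, 2, 3, and a tail 3-4-5.
network = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3, 5}, 5: {4}}
seeds = adaptive_degree_seeds(network, 2)
```

In the toy network, nodes 3 and 4 are tied on static degree, but once hub 0 is removed node 3 loses an edge, so the adaptive rule selects node 4 as the second seed.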
-
Error-Correcting Decoders for Communities in Networks
Authors:
Krishna C. Bathina,
Filippo Radicchi
Abstract:
As recent work demonstrated, the task of identifying communities in networks can be considered analogous to the classical problem of decoding messages transmitted along a noisy channel. We leverage this analogy to develop a community detection method directly inspired by a standard and widely-used decoding technique. We further simplify the algorithm to reduce the time complexity from quadratic to linear. We test the performance of the original and reduced versions of the algorithm on artificial benchmarks with pre-imposed community structure, and on real networks with annotated community structure. Results of our systematic analysis indicate that the proposed techniques are able to provide satisfactory results.
Submitted 3 February, 2019;
originally announced February 2019.
-
Characterizing the analogy between hyperbolic embedding and community structure of complex networks
Authors:
Ali Faqeeh,
Saeed Osat,
Filippo Radicchi
Abstract:
We show that the community structure of a network can be used as a coarse version of its embedding in a hidden space with hyperbolic geometry. The finding emerges from a systematic analysis of several real-world and synthetic networks. We take advantage of the analogy for reinterpreting results originally obtained through network hyperbolic embedding in terms of community structure only. First, we show that the robustness of a multiplex network can be controlled by tuning the correlation between the community structures across different layers. Second, we deploy an efficient greedy protocol for network navigability that makes use of routing tables based on community structure.
Submitted 29 August, 2018;
originally announced August 2018.
-
Weight Thresholding on Complex Networks
Authors:
Xiaoran Yan,
Lucas G. S. Jeub,
Alessandro Flammini,
Filippo Radicchi,
Santo Fortunato
Abstract:
Weight thresholding is a simple technique that aims at reducing the number of edges in weighted networks that are otherwise too dense for the application of standard graph theoretical methods. We show that the group structure of real weighted networks is very robust under weight thresholding, as it is maintained even when most of the edges are removed. This appears to be related to the correlation between topology and weight that characterizes real networks. On the other hand, the behavior of other properties is generally system dependent.
Submitted 5 October, 2018; v1 submitted 19 June, 2018;
originally announced June 2018.
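Illustrative note on weight thresholding as described in the abstract above. This is a minimal sketch, not the authors' code: it assumes edges are stored as a mapping from node pairs to weights and that an edge is kept when its weight is at least the threshold (whether the comparison is strict is a convention chosen here). The function name `threshold_edges` and the toy weights are hypothetical.

```python
from typing import Dict, Hashable, Tuple

Edge = Tuple[Hashable, Hashable]


def threshold_edges(edges: Dict[Edge, float], threshold: float) -> Dict[Edge, float]:
    """Keep only the edges whose weight is at least `threshold`."""
    return {pair: w for pair, w in edges.items() if w >= threshold}


# Hypothetical weighted network.
weighted = {
    ("a", "b"): 0.9,
    ("a", "c"): 0.5,
    ("b", "c"): 0.1,
    ("c", "d"): 0.05,
}

sparse = threshold_edges(weighted, 0.5)
fraction_kept = len(sparse) / len(weighted)
```

Sweeping the threshold and tracking a structural property of the surviving network against `fraction_kept` is one simple way to probe how robust that property is to edge removal.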
-
Controlling the uncertain response of real multiplex networks to random damage
Authors:
Francesco Coghi,
Filippo Radicchi,
Ginestra Bianconi
Abstract:
We reveal large fluctuations in the response of real multiplex networks to random damage of nodes. These results indicate that the average response to random damage, traditionally considered in mean-field approaches to percolation, is a poor metric of system robustness. We show instead that a large deviation approach to percolation provides a more accurate characterization of system robustness. We identify an effective percolation threshold at which we observe a clear abrupt transition separating two distinct regimes in which the most likely response to damage is either a functional or a dismantled multiplex network. We leverage our findings to propose a new metric, named safeguard centrality, able to single out the nodes that control the response of the entire multiplex network to random damage. We show that safeguarding the function of top-scoring nodes is sufficient to prevent system collapse.
Submitted 20 September, 2018; v1 submitted 2 May, 2018;
originally announced May 2018.
-
Influence maximization in noisy networks
Authors:
Şirag Erkol,
Ali Faqeeh,
Filippo Radicchi
Abstract:
We consider the problem of identifying the most influential nodes for a spreading process on a network when prior knowledge about structure and dynamics of the system is incomplete or erroneous. Specifically, we perform a numerical analysis where the set of top spreaders is determined on the basis of prior information that is artificially altered by a certain level of noise. We then measure the optimality of the chosen set by measuring its spreading impact in the true system. Whereas we find that the identification of top spreaders is optimal when prior knowledge is complete and free of mistakes, we also find that the quality of the top spreaders identified using noisy information doesn't necessarily decrease as the noise level increases. For instance, we show that it is generally possible to compensate for erroneous information about dynamical parameters by adding synthetic errors in the structure of the network. Further, we show that, in some dynamical regimes, even completely losing prior knowledge on network structure may be better than relying on certain but incomplete information.
Submitted 5 October, 2018; v1 submitted 6 March, 2018;
originally announced March 2018.
-
Decoding communities in networks
Authors:
Filippo Radicchi
Abstract:
According to a recent information-theoretical proposal, the problem of defining and identifying communities in networks can be interpreted as a classical communication task over a noisy channel: memberships of nodes are information bits erased by the channel, edges and non-edges in the network are parity bits introduced by the encoder but degraded through the channel, and a community identification algorithm is a decoder. The interpretation is perfectly equivalent to the one at the basis of well-known statistical inference algorithms for community detection. The only difference in the interpretation is that a noisy channel replaces a stochastic network model. However, the different perspective gives the opportunity to take advantage of the rich set of tools of coding theory to generate novel insights on the problem of community detection. In this paper, we illustrate two main applications of standard coding-theoretical methods to community detection. First, we leverage a state-of-the-art decoding technique to generate a family of quasi-optimal community detection algorithms. Second, and more importantly, we show that Shannon's noisy-channel coding theorem can be invoked to establish a lower bound, here named the decodability bound, for the maximum amount of noise tolerable by an ideal decoder to achieve perfect detection of communities. When computed for well-established synthetic benchmarks, the decodability bound accurately explains the performance achieved by the best available community detection algorithms, indicating that little room is left for their improvement.
Submitted 1 March, 2018; v1 submitted 14 November, 2017;
originally announced November 2017.
-
Optimal percolation on multiplex networks
Authors:
Saeed Osat,
Ali Faqeeh,
Filippo Radicchi
Abstract:
Optimal percolation is the problem of finding the minimal set of nodes such that if the members of this set are removed from a network, the network is fragmented into non-extensive disconnected clusters. The solution of the optimal percolation problem has direct applicability in strategies of immunization in disease spreading processes, and influence maximization for certain classes of opinion dynamical models. In this paper, we consider the problem of optimal percolation on multiplex networks. The multiplex scenario serves to realistically model various technological, biological, and social networks. We find that the multilayer nature of these systems, and more precisely multiplex characteristics such as edge overlap and interlayer degree-degree correlation, profoundly changes the properties of the set of nodes identified as the solution of the optimal percolation problem.
Submitted 5 July, 2017;
originally announced July 2017.
-
Uncertainty Reduction for Stochastic Processes on Complex Networks
Authors:
Filippo Radicchi,
Claudio Castellano
Abstract:
Many real-world systems are characterized by stochastic dynamical rules where a complex network of interactions among individual elements probabilistically determines their state. Even with full knowledge of the network structure and of the stochastic rules, the ability to predict system configurations is generally characterized by a large uncertainty. Selecting a fraction of the nodes and observing their state may help to reduce the uncertainty about the unobserved nodes. However, choosing these points of observation in an optimal way is a highly nontrivial task, depending on the nature of the stochastic process and on the structure of the underlying interaction pattern. In this paper, we introduce a computationally efficient algorithm to determine quasioptimal solutions to the problem. The method leverages network sparsity to reduce computational complexity from exponential to almost quadratic, thus allowing the straightforward application of the method to mid-to-large-size systems. Although the method is exact only for equilibrium stochastic processes defined on trees, it turns out to be effective also for out-of-equilibrium processes on sparse loopy networks.
Submitted 11 May, 2018; v1 submitted 10 March, 2017;
originally announced March 2017.
-
Observability transition in multiplex networks
Authors:
Saeed Osat,
Filippo Radicchi
Abstract:
We extend the observability model to multiplex networks. We present mathematical frameworks, valid under the treelike ansatz, able to describe the emergence of the macroscopic cluster of mutually observable nodes in both synthetic and real-world multiplex networks. We show that the observability transition in synthetic multiplex networks is discontinuous. In real-world multiplex networks, instead, edge overlap among layers is responsible for the disappearance of any sign of abruptness in the emergence of the macroscopic cluster of mutually observable nodes.
Submitted 15 January, 2017;
originally announced January 2017.
-
Quantifying perceived impact of scientific publications
Authors:
Filippo Radicchi,
Alexander Weissman,
Johan Bollen
Abstract:
Citations are commonly held to represent scientific impact. To date, however, there is no empirical evidence in support of this postulate that is central to research assessment exercises and Science of Science studies. Here, we report on the first empirical verification of the degree to which citation numbers represent scientific impact as it is actually perceived by experts in their respective field. We run a large-scale survey of about 2000 corresponding authors who performed a pairwise impact assessment task across more than 20000 scientific articles. Results of the survey show that citation data and perceived impact do not align well, unless one properly accounts for strong psychological biases that affect the opinions of experts with respect to their own papers vs. those of others. First, researchers tend to largely prefer their own publications to the most cited papers in their field of research. Second, there is only a mild positive correlation between the number of citations of top-cited papers in given research areas and expert preference in pairwise comparisons. This also applies to pairs of papers with several orders of magnitude differences in their total number of accumulated citations. However, when researchers were asked to choose among pairs of their own papers, thus eliminating the bias favouring one's own papers over those of others, they did systematically prefer the most cited article. We conclude that, when scientists have full information and are making unbiased choices, expert opinion on impact is congruent with citation numbers.
Submitted 12 December, 2016;
originally announced December 2016.
-
Percolation in real multiplex networks
Authors:
Ginestra Bianconi,
Filippo Radicchi
Abstract:
We present an exact mathematical framework able to describe site-percolation transitions in real multiplex networks. Specifically, we consider the average percolation diagram valid over an infinite number of random configurations where nodes are present in the system with a given probability. The approach relies on the locally treelike ansatz, so that it is expected to accurately reproduce the true percolation diagram of sparse multiplex networks with a negligible number of short loops. The performance of our theory is tested in social, biological, and transportation multiplex graphs. When compared against previously introduced methods, we observe improvements in the prediction of the percolation diagrams in all networks analyzed. Results from our method confirm previous claims about the robustness of real multiplex networks, in the sense that the average connectedness of the system does not exhibit any significant abrupt change as its individual components are randomly destroyed.
Submitted 27 October, 2016;
originally announced October 2016.
-
Redundant interdependencies boost the robustness of multilayer networks
Authors:
Filippo Radicchi,
Ginestra Bianconi
Abstract:
In the analysis of the robustness of multiplex networks, it is commonly assumed that a node is functioning only if its interdependent nodes are simultaneously functioning. According to this model, a multiplex network becomes more and more fragile as the number of layers increases. In this respect, the addition of a new layer of interdependent nodes to a preexisting multiplex network will never improve its robustness. Whereas such a model seems appropriate to understand the effect of interdependencies in the simplest scenario of a network composed of only two layers, it may not seem suitable to characterize the robustness of real systems formed by multiple network layers. It seems in fact unrealistic that a real system evolved, through the development of multiple layers of interactions, towards a fragile structure. In this paper, we introduce a model of percolation where the condition that makes a node functional is that the node is functioning in at least two of the layers of the network. The model reduces to the commonly adopted percolation model for multiplex networks when the number of layers equals two. For a larger number of layers, however, the model describes a scenario where the addition of new layers boosts the robustness of the system by creating redundant interdependencies among layers. We prove this fact thanks to the development of a message-passing theory able to characterize the model in both synthetic and real-world multiplex graphs.
Submitted 9 March, 2017; v1 submitted 17 October, 2016;
originally announced October 2016.
-
Fundamental difference between superblockers and superspreaders in networks
Authors:
Filippo Radicchi,
Claudio Castellano
Abstract:
Two very important problems regarding spreading phenomena in complex topologies are the optimal selection of node sets either to minimize or maximize the extent of outbreaks. Both problems are nontrivial when a small fraction of the nodes in the network can be used to achieve the desired goal. The minimization problem is equivalent to a structural optimization. The "superblockers", i.e., the nodes that should be removed from the network to minimize the size of outbreaks, are those nodes that make connected components as small as possible. "Superspreaders" are instead the nodes such that, if chosen as initiators, they maximize the average size of outbreaks. The identity of superspreaders is expected to depend not just on the topology, but also on the specific dynamics considered. Recently, it has been conjectured that the two optimization problems might be equivalent, in the sense that superblockers act also as superspreaders. In spite of its potential groundbreaking importance, no empirical study has been performed to validate this conjecture. In this paper, we perform an extensive analysis over a large set of real-world networks to test the similarity between sets of superblockers and of superspreaders. We show that the two optimization problems are not equivalent: superblockers do not act as optimal spreaders.
Submitted 20 January, 2017; v1 submitted 10 October, 2016;
originally announced October 2016.
-
Observability transition in real networks
Authors:
Yang Yang,
Filippo Radicchi
Abstract:
We consider the observability model in networks with arbitrary topologies. We introduce a system of coupled nonlinear equations, valid under the locally tree-like ansatz, to describe the size of the largest observable cluster as a function of the fraction of directly observable nodes present in the network. We perform a systematic analysis on 95 real-world graphs and compare our theoretical predictions with numerical simulations of the observability model. Our method provides almost perfect predictions in the majority of the cases, even for networks with very large values of the clustering coefficient. Potential applications of our theory include the development of efficient and scalable algorithms for real-time surveillance of social networks, and monitoring of technological networks.
Submitted 24 July, 2016;
originally announced July 2016.
-
Citation success index - An intuitive pair-wise journal comparison metric
Authors:
Staša Milojević,
Filippo Radicchi,
Judit Bar-Ilan
Abstract:
In this paper we present the "citation success index", a metric for comparing the citation capacity of pairs of journals. The citation success index is the probability that a random paper in one journal has more citations than a random paper in another journal (50% means the two journals do equally well). Unlike the journal impact factor (IF), the citation success index depends on the broadness and the shape of citation distributions. Also, it is insensitive to sporadic highly-cited papers that skew the IF. Nevertheless, we show, based on 16,000 journals containing ~2.4 million articles, that the citation success index is a relatively tight function of the ratio of IFs of the journals being compared, due to the fact that journals with the same IF have quite similar citation distributions. The citation success index grows slowly as a function of the IF ratio. It is substantial (>90%) only when the ratio of IFs exceeds ~6, whereas a factor of two difference in IF values translates into a modest advantage for the journal with the higher IF (index of ~70%). We facilitate the wider adoption of this metric by providing an online calculator that takes as input parameters only the IFs of the pair of journals.
Submitted 21 December, 2016; v1 submitted 11 July, 2016;
originally announced July 2016.
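The definition of the citation success index given in the abstract above can be illustrated with a minimal sketch. It is not the authors' calculator: the function name `citation_success_index`, the toy citation counts, and the choice to count ties as half a win (so that two identical journals score exactly 50%) are assumptions made for illustration.

```python
from itertools import product


def citation_success_index(citations_a, citations_b):
    """Estimate the probability that a random paper from journal A
    has more citations than a random paper from journal B.

    Assumption: ties count as half a win, so identical citation
    distributions give exactly 0.5.
    """
    if not citations_a or not citations_b:
        raise ValueError("each journal needs at least one paper")

    wins = 0.0
    # Compare every paper of A against every paper of B.
    for a, b in product(citations_a, citations_b):
        if a > b:
            wins += 1.0
        elif a == b:
            wins += 0.5
    return wins / (len(citations_a) * len(citations_b))


if __name__ == "__main__":
    # Hypothetical citation counts, not real journal data.
    journal_a = [0, 2, 5, 9, 40]
    journal_b = [1, 1, 3, 4, 6]
    print(citation_success_index(journal_a, journal_b))
```

Because the index compares papers pairwise rather than averaging citation counts, the single highly cited paper in the hypothetical `journal_a` contributes no more than any other paper that beats all of `journal_b`.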
-
Breaking of the site-bond percolation universality in networks
Authors:
Filippo Radicchi,
Claudio Castellano
Abstract:
The stochastic addition of either vertices or connections in a network leads to the observation of the percolation transition, a structural change with the appearance of a connected component encompassing a finite fraction of the system. Percolation has always been regarded as a substrate-dependent but model-independent process, in the sense that the critical exponents of the transition are determined by the geometry of the system, but they are identical for the bond and site percolation models. Here, we report a violation of this assumption. We provide analytical and numerical evidence of a difference in the values of the critical exponents between the bond and site percolation models in networks with null percolation thresholds, such as scale-free graphs with diverging second moment of the degree distribution. We discuss possible implications of our results in real networks, and provide additional insights on the anomalous nature of the percolation transition with null threshold.
Submitted 22 June, 2016;
originally announced June 2016.
-
Leveraging percolation theory to single out influential spreaders in networks
Authors:
Filippo Radicchi,
Claudio Castellano
Abstract:
Among the consequences of the disordered interaction topology underlying many social, technological and biological systems, a particularly important one is that some nodes, just because of their position in the network, may have a disproportionate effect on dynamical processes mediated by the complex interaction pattern. For example, the early adoption by an opinion leader in a social network may change the fate of a commercial product, or just a few super-spreaders may determine the virality of a meme in social media. Despite many recent efforts, the formulation of an accurate method to optimally identify influential nodes in complex network topologies remains an unsolved challenge. Here, we present the exact solution of the problem for the specific, but highly relevant, case of the Susceptible-Infected-Removed (SIR) model for epidemic spreading at criticality. By exploiting the mapping between bond percolation and the static properties of SIR, we prove that the recently introduced Non-Backtracking centrality is the optimal criterion for the identification of influential spreaders in locally tree-like networks at criticality. By means of simulations on synthetic networks and on a very extensive set of real-world networks, we show that the Non-Backtracking centrality is a highly reliable metric to identify top influential spreaders also in generic graphs not embedded in space, and for noncritical spreading.
Submitted 23 May, 2016;
originally announced May 2016.
-
Beyond the locally tree-like approximation for percolation on real networks
Authors:
Filippo Radicchi,
Claudio Castellano
Abstract:
Theoretical attempts proposed so far to describe ordinary percolation processes on real-world networks rely on the locally tree-like ansatz. Such an approximation, however, holds only to a limited extent, as real graphs are often characterized by high frequencies of short loops. We present here a theoretical framework able to overcome such a limitation for the case of site percolation. Our method is based on a message passing algorithm that discounts redundant paths along triangles in the graph. We systematically test the approach on 98 real-world graphs and on synthetic networks. We find excellent accuracy in the prediction of the whole percolation diagram, with significant improvement with respect to the prediction obtained under the locally tree-like approximation. Residual discrepancies between theory and simulations do not depend on clustering and can be attributed to the presence of loops longer than three edges. We present also a method to account for clustering in bond percolation, but the improvement with respect to the method based on the tree-like approximation is much less apparent.
Submitted 23 February, 2016;
originally announced February 2016.
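The abstract above compares theoretical predictions against numerical simulations of site percolation. As a rough illustration of what such a simulation baseline might look like, the sketch below estimates a percolation diagram by Monte Carlo sampling. It does not implement the paper's message-passing method; the function names, the adjacency-dictionary input format, and the number of runs are all assumptions.

```python
import random
from collections import deque


def largest_cluster_fraction(adjacency, p, rng):
    """Occupy each node independently with probability p and return the
    size of the largest connected cluster of occupied nodes, divided by
    the total number of nodes in the graph."""
    n = len(adjacency)
    if n == 0:
        return 0.0
    occupied = {node for node in adjacency if rng.random() < p}
    seen = set()
    best = 0
    for start in occupied:
        if start in seen:
            continue
        # Breadth-first search restricted to occupied nodes.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor in occupied and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best / n


def percolation_diagram(adjacency, probabilities, runs=100, seed=0):
    """Average the largest-cluster fraction over `runs` random
    configurations for each occupation probability."""
    rng = random.Random(seed)
    return [
        sum(largest_cluster_fraction(adjacency, p, rng) for _ in range(runs)) / runs
        for p in probabilities
    ]


if __name__ == "__main__":
    # Hypothetical toy graph: a ring of 6 nodes.
    ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
    print(percolation_diagram(ring, [0.0, 0.5, 1.0], runs=200))
```

A theory such as the one described in the abstract would be judged by how closely its predicted curve matches the averaged simulation curve across the whole range of occupation probabilities.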