-
Luck, skill, and depth of competition in games and social hierarchies
Authors:
Maximilian Jerdee,
M. E. J. Newman
Abstract:
Patterns of wins and losses in pairwise contests, such as occur in sports and games, consumer research and paired comparison studies, and human and animal social hierarchies, are commonly analyzed using probabilistic models that allow one to quantify the strength of competitors or predict the outcome of future contests. Here we generalize this approach to incorporate two additional features: an element of randomness or luck that leads to upset wins, and a "depth of competition" variable that measures the complexity of a game or hierarchy. Fitting the resulting model to a large collection of data sets we estimate depth and luck in a range of games, sports, and social situations. In general, we find that social competition tends to be "deep," meaning it has a pronounced hierarchy with many distinct levels, but also that there is often a nonzero chance of an upset victory, meaning that dominance challenges can be won even by significant underdogs. Competition in sports and games, by contrast, tends to be shallow and in most cases there is little evidence of upset wins, beyond those already implied by the shallowness of the hierarchy.
Submitted 7 December, 2023;
originally announced December 2023.
-
Message passing methods on complex networks
Authors:
M. E. J. Newman
Abstract:
Networks and network computations have become a primary mathematical tool for analyzing the structure of many kinds of complex systems, ranging from the Internet and transportation networks to biochemical interactions and social networks. A common task in network analysis is the calculation of quantities that reside on the nodes of a network, such as centrality measures, probabilities, or model states. In this review article we discuss message passing methods, a family of techniques for performing such calculations, based on the propagation of information between the nodes of a network. We introduce the message passing approach with a series of examples, give some illustrative applications and results, and discuss the deep connections between message passing and phase transitions in networks. We also point out some limitations of the message passing approach and describe some recently-introduced methods that address these limitations.
Submitted 9 November, 2022;
originally announced November 2022.
-
20 years of network community detection
Authors:
Santo Fortunato,
M. E. J. Newman
Abstract:
A fundamental technical challenge in the analysis of network data is the automated discovery of communities - groups of nodes that are strongly connected or that share similar features or roles. In this commentary we review progress in the field over the last 20 years.
Submitted 2 August, 2022; v1 submitted 29 July, 2022;
originally announced August 2022.
-
Cutting Through the Noise to Infer Autonomous System Topology
Authors:
Kirtus G. Leyba,
Joshua J. Daymude,
Jean-Gabriel Young,
M. E. J. Newman,
Jennifer Rexford,
Stephanie Forrest
Abstract:
The Border Gateway Protocol (BGP) is a distributed protocol that manages interdomain routing without requiring a centralized record of which autonomous systems (ASes) connect to which others. Many methods have been devised to infer the AS topology from publicly available BGP data, but none provide a general way to handle the fact that the data are notoriously incomplete and subject to error. This paper describes a method for reliably inferring AS-level connectivity in the presence of measurement error using Bayesian statistical inference acting on BGP routing tables from multiple vantage points. We employ a novel approach for counting AS adjacency observations in the AS-PATH attribute data from public route collectors, along with a Bayesian algorithm to generate a statistical estimate of the AS-level network. Our approach also gives us a way to evaluate the accuracy of existing reconstruction methods and to identify advantageous locations for new route collectors or vantage points.
Submitted 18 January, 2022;
originally announced January 2022.
-
Clustering of heterogeneous populations of networks
Authors:
Jean-Gabriel Young,
Alec Kirkley,
M. E. J. Newman
Abstract:
Statistical methods for reconstructing networks from repeated measurements typically assume that all measurements are generated from the same underlying network structure. This need not be the case, however. People's social networks might be different on weekdays and weekends, for instance. Brain networks may differ between healthy patients and those with dementia or other conditions. Here we describe a Bayesian analysis framework for such data that allows for the fact that network measurements may be reflective of multiple possible structures. We define a finite mixture model of the measurement process and derive a fast Gibbs sampling procedure that samples exactly from the full posterior distribution of model parameters. The end result is a clustering of the measured networks into groups with similar structure. We demonstrate the method on both real and synthetic network populations.
Submitted 23 January, 2022; v1 submitted 15 July, 2021;
originally announced July 2021.
-
The friendship paradox in real and model networks
Authors:
George T. Cantwell,
Alec Kirkley,
M. E. J. Newman
Abstract:
The friendship paradox is the observation that the degrees of the neighbors of a node in any network will, on average, be greater than the degree of the node itself. In common parlance, your friends have more friends than you do. In this paper we develop the mathematical theory of the friendship paradox, both in general as well as for specific model networks, focusing not only on average behavior but also on variation about the average and using generating function methods to calculate full distributions of quantities of interest. We compare the predictions of our theory with measurements on a large number of real-world network data sets and find remarkably good agreement. We also develop equivalent theory for the generalized friendship paradox, which compares characteristics of nodes other than degree to those of their neighbors.
Submitted 7 December, 2020;
originally announced December 2020.
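The basic statement of the friendship paradox in the abstract above can be checked numerically on a toy graph. The following is a minimal sketch, not the authors' generating-function theory: the function name `degree_stats` and the star-graph example are assumptions made for illustration, and the "neighbor" average used here is the one taken over the ends of edges, which equals the ratio of the mean squared degree to the mean degree.

```python
from collections import defaultdict


def degree_stats(edges):
    """Return (mean degree, mean degree of a neighbor) for an undirected simple graph.

    `edges` is a list of (u, v) pairs with no self-loops or repeated edges.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    degrees = [len(neighbors) for neighbors in adjacency.values()]
    mean_degree = sum(degrees) / len(degrees)

    # Following every edge in both directions arrives at a node of degree k
    # exactly k times, so the mean degree of a neighbor is sum(k^2) / sum(k).
    mean_neighbor_degree = sum(k * k for k in degrees) / sum(degrees)
    return mean_degree, mean_neighbor_degree


# Hypothetical example: a star with one hub (node 0) and four leaves.
star_edges = [(0, 1), (0, 2), (0, 3), (0, 4)]
mean_k, mean_neighbor_k = degree_stats(star_edges)
print(mean_k, mean_neighbor_k)
```

On the star graph the mean degree is 8/5 while the mean neighbor degree is 20/8, so the neighbor average is the larger of the two; the two are equal only when every node has the same degree.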
-
Bayesian inference of network structure from unreliable data
Authors:
Jean-Gabriel Young,
George T. Cantwell,
M. E. J. Newman
Abstract:
Most empirical studies of complex networks do not return direct, error-free measurements of network structure. Instead, they typically rely on indirect measurements that are often error-prone and unreliable. A fundamental problem in empirical network science is how to make the best possible estimates of network structure given such unreliable data. In this paper we describe a fully Bayesian method for reconstructing networks from observational data in any format, even when the data contain substantial measurement error and when the nature and magnitude of that error is unknown. The method is introduced through pedagogical case studies using real-world example networks, and specifically tailored to allow straightforward, computationally efficient implementation with a minimum of technical input. Computer code implementing the method is publicly available.
Submitted 9 March, 2021; v1 submitted 7 August, 2020;
originally announced August 2020.
-
Consistency of community structure in complex networks
Authors:
Maria A. Riolo,
M. E. J. Newman
Abstract:
The most widely used techniques for community detection in networks, including methods based on modularity, statistical inference, and information theoretic arguments, all work by optimizing objective functions that measure the quality of network partitions. There is a good case to be made, however, that one should not look solely at the single optimal community structure under such an objective function, but rather at a selection of high-scoring structures. If one does this one typically finds that the resulting structures show considerable variation, and this has been taken as evidence that these community detection methods are unreliable, since they do not appear to give consistent answers. Here we argue that, upon closer inspection, the structures found are in fact consistent in a certain way. Specifically, we show that they can all be assembled from a set of underlying "building blocks", groups of network nodes that are usually found together in the same community. Different community structures correspond to different arrangements of blocks, but the blocks themselves are largely invariant. We propose an information theoretic method for discovering the building blocks in specific networks and demonstrate it with several example applications. We conclude that traditional community detection is not the failure some have suggested it is, and that in fact it gives a significant amount of insight into network structure, although perhaps not in exactly the way previously imagined.
Submitted 26 August, 2019;
originally announced August 2019.
-
Improved mutual information measure for classification and community detection
Authors:
M. E. J. Newman,
George T. Cantwell,
Jean-Gabriel Young
Abstract:
The information theoretic quantity known as mutual information finds wide use in classification and community detection analyses to compare two classifications of the same set of objects into groups. In the context of classification algorithms, for instance, it is often used to compare discovered classes to known ground truth and hence to quantify algorithm performance. Here we argue that the standard mutual information, as commonly defined, omits a crucial term which can become large under real-world conditions, producing results that can be substantially in error. We demonstrate how to correct this error and define a mutual information that works in all cases. We discuss practical implementation of the new measure and give some example applications.
Submitted 29 July, 2019;
originally announced July 2019.
-
Message passing on networks with loops
Authors:
George T. Cantwell,
M. E. J. Newman
Abstract:
In this paper we offer a solution to a long-standing problem in the study of networks. Message passing is a fundamental technique for calculations on networks and graphs. The first versions of the method appeared in the 1930s and over the decades it has been applied to a wide range of foundational problems in mathematics, physics, computer science, statistics, and machine learning, including Bayesian inference, spin models, coloring, satisfiability, graph partitioning, network epidemiology, and the calculation of matrix eigenvalues. Despite its wide use, however, it has long been recognized that the method has a fundamental flaw: it only works on networks that are free of short loops. Loops introduce correlations that cause the method to give inaccurate answers at best, and to fail completely in the worst cases. Unfortunately, almost all real-world networks contain many short loops, which limits the usefulness of the message passing approach. In this paper we demonstrate how to rectify this shortcoming and create message passing methods that work on any network. We give two example applications, one to the percolation properties of networks and the other to the calculation of the spectra of sparse matrices.
Submitted 18 July, 2019;
originally announced July 2019.
-
Spectra of networks containing short loops
Authors:
M. E. J. Newman
Abstract:
The spectrum of the adjacency matrix plays several important roles in the mathematical theory of networks and in network data analysis, for example in percolation theory, community detection, centrality measures, and the theory of dynamical systems on networks. A number of methods have been developed for the analytic computation of network spectra, but they typically assume that networks are locally tree-like, meaning that the local neighborhood of any node takes the form of a tree, free of short loops. Empirically observed networks, by contrast, often have many short loops. Here we develop an approach for calculating the spectra of networks with short loops using a message passing method. We give example applications to some previously studied classes of networks.
Submitted 12 February, 2019;
originally announced February 2019.
-
Spectra of random networks with arbitrary degrees
Authors:
M. E. J. Newman,
Xiao Zhang,
Raj Rao Nadakuditi
Abstract:
We derive a message passing method for computing the spectra of locally tree-like networks and an approximation to it that allows us to compute closed-form expressions or fast numerical approximates for the spectral density of random graphs with arbitrary node degrees -- the so-called configuration model. We find the latter approximation to work well for all but the sparsest of networks. We also derive bounds on the position of the band edges of the spectrum, which are important for identifying structural phase transitions in networks.
Submitted 7 January, 2019;
originally announced January 2019.
-
Mixing patterns and individual differences in networks
Authors:
George T. Cantwell,
M. E. J. Newman
Abstract:
We study mixing patterns in networks, meaning the propensity for nodes of different kinds to connect to one another. The phenomenon of assortative mixing, whereby nodes prefer to connect to others that are similar to themselves, has been widely studied, but here we go further and examine how and to what extent nodes that are otherwise similar can have different preferences. Many individuals in a friendship network, for instance, may prefer friends who are roughly the same age as themselves, but some may display a preference for older or younger friends. We introduce a network model that captures this behavior and a method for fitting it to empirical network data. We propose metrics to characterize the mean and variation of mixing patterns and show how to infer their values from the fitted model, either using maximum-likelihood estimates of model parameters or in a Bayesian framework that does not require fixing any parameters.
Submitted 17 April, 2019; v1 submitted 2 October, 2018;
originally announced October 2018.
-
Balance in signed networks
Authors:
Alec Kirkley,
George T. Cantwell,
M. E. J. Newman
Abstract:
We consider signed networks in which connections or edges can be either positive (friendship, trust, alliance) or negative (dislike, distrust, conflict). Early literature in graph theory theorized that such networks should display "structural balance," meaning that certain configurations of positive and negative edges are favored and others are disfavored. Here we propose two measures of balance in signed networks based on the established notions of weak and strong balance, and compare their performance on a range of tasks with each other and with previously proposed measures. In particular, we ask whether real-world signed networks are significantly balanced by these measures compared to an appropriate null model, finding that indeed they are, by all the measures studied. We also test our ability to predict unknown signs in otherwise known networks by maximizing balance. In a series of cross-validation tests we find that our measures are able to predict signs substantially better than chance.
Submitted 13 September, 2018;
originally announced September 2018.
-
Estimating network structure from unreliable measurements
Authors:
M. E. J. Newman
Abstract:
Most empirical studies of networks assume that the network data we are given represent a complete and accurate picture of the nodes and edges in the system of interest, but in real-world situations this is rarely the case. More often the data only specify the network structure imperfectly -- like data in essentially every other area of empirical science, network data are prone to measurement error and noise. At the same time, the data may be richer than simple network measurements, incorporating multiple measurements, weights, lengths or strengths of edges, node or edge labels, or annotations of various kinds. Here we develop a general method for making estimates of network structure and properties using any form of network data, simple or complex, when the data are unreliable, and give example applications to a selection of social and biological networks.
Submitted 18 December, 2018; v1 submitted 6 March, 2018;
originally announced March 2018.
-
Efficient method for estimating the number of communities in a network
Authors:
Maria A. Riolo,
George T. Cantwell,
Gesine Reinert,
M. E. J. Newman
Abstract:
While there exist a wide range of effective methods for community detection in networks, most of them require one to know in advance how many communities one is looking for. Here we present a method for estimating the number of communities in a network using a combination of Bayesian inference with a novel prior and an efficient Monte Carlo sampling scheme. We test the method extensively on both real and computer-generated networks, showing that it performs accurately and consistently, even in cases where groups are widely varying in size or structure.
Submitted 7 June, 2017;
originally announced June 2017.
-
Network structure from rich but noisy data
Authors:
M. E. J. Newman
Abstract:
Driven by growing interest in the sciences, industry, and among the broader public, a large number of empirical studies have been conducted in recent years of the structure of networks ranging from the internet and the world wide web to biological networks and social networks. The data produced by these experiments are often rich and multimodal, yet at the same time they may contain substantial measurement error. In practice, this means that the true network structure can differ greatly from naive estimates made from the raw data, and hence that conclusions drawn from those naive estimates may be significantly in error. In this paper we describe a technique that circumvents this problem and allows us to make optimal estimates of the true structure of networks in the presence of both richly textured data and significant measurement uncertainty. We give example applications to two different social networks, one derived from face-to-face interactions and one from self-reported friendships.
Submitted 6 February, 2018; v1 submitted 21 March, 2017;
originally announced March 2017.
-
Random graph models for dynamic networks
Authors:
Xiao Zhang,
Cristopher Moore,
M. E. J. Newman
Abstract:
We propose generalizations of a number of standard network models, including the classic random graph, the configuration model, and the stochastic block model, to the case of time-varying networks. We assume that the presence and absence of edges are governed by continuous-time Markov processes with rate parameters that can depend on properties of the nodes. In addition to computing equilibrium properties of these models, we demonstrate their use in data analysis and statistical inference, giving efficient algorithms for fitting them to observed network data. This allows us, for instance, to estimate the time constants of network evolution or infer community structure from temporal network data using cues embedded both in the probabilities over time that node pairs are connected by edges and in the characteristic dynamics of edge appearance and disappearance. We illustrate our methods with a selection of applications, both to computer-generated test networks and real-world examples.
Submitted 26 July, 2016;
originally announced July 2016.
-
Community detection in networks: Modularity optimization and maximum likelihood are equivalent
Authors:
M. E. J. Newman
Abstract:
We demonstrate an exact equivalence between two widely used methods of community detection in networks, the method of modularity maximization in its generalized form which incorporates a resolution parameter controlling the size of the communities discovered, and the method of maximum likelihood applied to the special case of the stochastic block model known as the planted partition model, in which all communities in a network are assumed to have statistically similar properties. Among other things, this equivalence provides a mathematically principled derivation of the modularity function, clarifies the conditions and assumptions of its use, and gives an explicit formula for the optimal value of the resolution parameter.
Submitted 7 June, 2016;
originally announced June 2016.
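For readers unfamiliar with the quantity this abstract refers to, the generalized modularity with a resolution parameter gamma is commonly written as Q = (1/2m) sum_ij [A_ij - gamma k_i k_j / 2m] delta(g_i, g_j), where m is the number of edges, k_i is the degree of node i, and g_i is its group. The following is a minimal sketch of that standard definition only. It does not reproduce the paper's equivalence proof or its formula for the optimal resolution parameter, and the function name and example graph are assumptions made for illustration.

```python
def generalized_modularity(edges, group, gamma=1.0):
    """Generalized modularity of a partition of an undirected simple graph.

    `edges` is a list of (u, v) pairs, `group` maps each node to a group
    label, and `gamma` is the resolution parameter (gamma = 1 gives the
    conventional modularity).
    """
    m = len(edges)

    # Degree of every node.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Number of edges that fall inside each group.
    edges_within = {}
    for u, v in edges:
        if group[u] == group[v]:
            edges_within[group[u]] = edges_within.get(group[u], 0) + 1

    # Sum of degrees of the nodes in each group.
    degree_sum = {}
    for node, k in degree.items():
        degree_sum[group[node]] = degree_sum.get(group[node], 0) + k

    # Grouping the double sum over node pairs by community gives
    # Q = sum_r [ e_rr / m - gamma * (K_r / 2m)^2 ].
    return sum(
        edges_within.get(r, 0) / m - gamma * (K / (2 * m)) ** 2
        for r, K in degree_sum.items()
    )


# Hypothetical example: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_two = generalized_modularity(edges, two_groups)
q_one = generalized_modularity(edges, one_group)
print(q_two, q_one)
```

With gamma = 1 the two-triangle split scores 5/14, while placing all nodes in a single group scores zero. Raising gamma penalizes large groups more heavily, which is how the resolution parameter controls the size of the communities found.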
-
Estimating the number of communities in a network
Authors:
M. E. J. Newman,
Gesine Reinert
Abstract:
Community detection, the division of a network into dense subnetworks with only sparse connections between them, has been a topic of vigorous study in recent years. However, while there exist a range of powerful and flexible methods for dividing a network into a specified number of communities, it is an open question how to determine exactly how many communities one should use. Here we describe a mathematically principled approach for finding the number of communities in a network using a maximum-likelihood method. We demonstrate the approach on a range of real-world examples with known community structure, finding that it is able to determine the number of communities correctly in every case.
Submitted 23 August, 2016; v1 submitted 9 May, 2016;
originally announced May 2016.
-
Community detection in networks with unequal groups
Authors:
Pan Zhang,
Cristopher Moore,
M. E. J. Newman
Abstract:
Recently, a phase transition has been discovered in the network community detection problem below which no algorithm can tell which nodes belong to which communities with success any better than a random guess. This result has, however, so far been limited to the case where the communities have the same size or the same average degree. Here we consider the case where the sizes or average degrees are different. This asymmetry allows us to assign nodes to communities with better-than-random success by examining their local neighborhoods. Using the cavity method, we show that this removes the detectability transition completely for networks with four groups or fewer, while for more than four groups the transition persists up to a critical amount of asymmetry but not beyond. The critical point in the latter case coincides with the point at which local information percolates, causing a global transition from a less-accurate solution to a more-accurate one.
Submitted 10 September, 2015; v1 submitted 31 August, 2015;
originally announced September 2015.
-
Multiway spectral community detection in networks
Authors:
Xiao Zhang,
M. E. J. Newman
Abstract:
One of the most widely used methods for community detection in networks is the maximization of the quality function known as modularity. Of the many maximization techniques that have been used in this context, some of the most conceptually attractive are the spectral methods, which are based on the eigenvectors of the modularity matrix. Spectral algorithms have, however, been limited by and large to the division of networks into only two or three communities, with divisions into more than three being achieved by repeated two-way division. Here we present a spectral algorithm that can directly divide a network into any number of communities. The algorithm makes use of a mapping from modularity maximization to a vector partitioning problem, combined with a fast heuristic for vector partitioning. We compare the performance of this spectral algorithm with previous approaches and find it to give superior results, particularly in cases where community sizes are unbalanced. We also give demonstrative applications of the algorithm to two real-world networks and find that it produces results in good agreement with expectations for the networks studied.
Submitted 22 June, 2015;
originally announced July 2015.
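As a point of reference for the spectral methods mentioned in this abstract, the sketch below shows the classic two-way division, which reads group membership from the signs of the leading eigenvector of the modularity matrix B = A - k kᵀ / 2m. It is not the multiway vector-partitioning algorithm the paper presents; the function name `two_way_spectral_split` and the toy graph are illustrative assumptions.

```python
import numpy as np

def two_way_spectral_split(adjacency):
    """Split nodes into two groups using the signs of the leading
    eigenvector of the modularity matrix (classic two-way method)."""
    A = np.asarray(adjacency, dtype=float)
    k = A.sum(axis=1)                 # node degrees
    two_m = k.sum()                   # twice the number of edges
    B = A - np.outer(k, k) / two_m    # modularity matrix
    # eigh returns eigenvalues in ascending order for symmetric matrices,
    # so the last column is the leading eigenvector.
    _, eigvecs = np.linalg.eigh(B)
    leading = eigvecs[:, -1]
    return [0 if x >= 0 else 1 for x in leading]

# Hypothetical toy graph: two triangles joined by a single edge (2-3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

labels = two_way_spectral_split(A)
```

On this toy graph the two triangles end up in different groups. Which group is labeled 0 depends on the arbitrary sign of the eigenvector returned by the solver.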
-
Structure and inference in annotated networks
Authors:
M. E. J. Newman,
Aaron Clauset
Abstract:
For many networks of scientific interest we know both the connections of the network and information about the network nodes, such as the age or gender of individuals in a social network, geographic location of nodes in the Internet, or cellular function of nodes in a gene regulatory network. Here we demonstrate how this "metadata" can be used to improve our analysis and understanding of network structure. We focus in particular on the problem of community detection in networks and develop a mathematically principled approach that combines a network and its metadata to detect communities more accurately than can be done with either alone. Crucially, the method does not assume that the metadata are correlated with the communities we are trying to find. Instead the method learns whether a correlation exists and correctly uses or ignores the metadata depending on whether they contain useful information. The learned correlations are also of interest in their own right, allowing us to make predictions about the community membership of nodes whose network connections are unknown. We demonstrate our method on synthetic networks with known structure and on real-world networks, large and small, drawn from social, biological, and technological domains.
Submitted 14 July, 2015;
originally announced July 2015.
-
Structural inference for uncertain networks
Authors:
Travis Martin,
Brian Ball,
M. E. J. Newman
Abstract:
In the study of networked systems such as biological, technological, and social networks the available data are often uncertain. Rather than knowing the structure of a network exactly, we know the connections between nodes only with a certain probability. In this paper we develop methods for the analysis of such uncertain data, focusing particularly on the problem of community detection. We give a principled maximum-likelihood method for inferring community structure and demonstrate how the results can be used to make improved estimates of the true structure of the network. Using computer-generated benchmark networks we demonstrate that our methods are able to reconstruct known communities more accurately than previous approaches based on data thresholding. We also give an example application to the detection of communities in a protein-protein interaction network.
Submitted 17 June, 2015;
originally announced June 2015.
-
Generalized communities in networks
Authors:
M. E. J. Newman,
Tiago P. Peixoto
Abstract:
A substantial volume of research has been devoted to studies of community structure in networks, but communities are not the only possible form of large-scale network structure. Here we describe a broad extension of community structure that encompasses traditional communities but includes a wide range of generalized structural patterns as well. We describe a principled method for detecting this generalized structure in empirical network data and demonstrate with real-world examples how it can be used to learn new things about the shape and meaning of networks.
Submitted 27 May, 2015;
originally announced May 2015.
-
Identification of core-periphery structure in networks
Authors:
Xiao Zhang,
Travis Martin,
M. E. J. Newman
Abstract:
Many networks can be usefully decomposed into a dense core plus an outlying, loosely-connected periphery. Here we propose an algorithm for performing such a decomposition on empirical network data using methods of statistical inference. Our method fits a generative model of core-periphery structure to observed data using a combination of an expectation-maximization algorithm for calculating the parameters of the model and a belief propagation algorithm for calculating the decomposition itself. We find the method to be efficient, scaling easily to networks with a million or more nodes and we test it on a range of networks, including real-world examples as well as computer-generated benchmarks, for which it successfully identifies known core-periphery structure with low error rate. We also demonstrate that the method is immune from the detectability transition observed in the related community detection problem, which prevents the detection of community structure when that structure is too weak. There is no such transition for core-periphery structure, which is detectable, albeit with some statistical error, no matter how weak it is.
Submitted 16 September, 2014;
originally announced September 2014.
-
Equitable random graphs
Authors:
M. E. J. Newman,
Travis Martin
Abstract:
Random graph models have played a dominant role in the theoretical study of networked systems. The Poisson random graph of Erdős and Rényi, in particular, as well as the so-called configuration model, have served as the starting point for numerous calculations. In this paper we describe another large class of random graph models, which we call equitable random graphs and which are flexible enough to represent networks with diverse degree distributions and many nontrivial types of structure, including community structure, bipartite structure, degree correlations, stratification, and others, yet are exactly solvable for a wide range of properties in the limit of large graph size, including percolation properties, complete spectral density, and the behavior of homogeneous dynamical systems, such as coupled oscillators or epidemic models.
Submitted 6 May, 2014;
originally announced May 2014.
-
Percolation on sparse networks
Authors:
Brian Karrer,
M. E. J. Newman,
Lenka Zdeborová
Abstract:
We study percolation on networks, which is used as a model of the resilience of networked systems such as the Internet to attack or failure and as a simple model of the spread of disease over human contact networks. We reformulate percolation as a message passing process and demonstrate how the resulting equations can be used to calculate, among other things, the size of the percolating cluster and the average cluster size. The calculations are exact for sparse networks when the number of short loops in the network is small, but even on networks with many short loops we find them to be highly accurate when compared with direct numerical simulations. By considering the fixed points of the message passing process, we also show that the percolation threshold on a network with few loops is given by the inverse of the leading eigenvalue of the so-called non-backtracking matrix.
Submitted 7 October, 2014; v1 submitted 2 May, 2014;
originally announced May 2014.
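The threshold result stated at the end of this abstract can be illustrated numerically. The following is a minimal sketch, assuming a small undirected graph given as an edge list: it builds the non-backtracking matrix over directed edges by brute force and returns the inverse of its leading eigenvalue. The function names are hypothetical, and the dense construction is only suitable for tiny examples.

```python
import numpy as np

def non_backtracking_matrix(edges):
    """Non-backtracking matrix over the 2m directed edges of an
    undirected graph: entry ((u->v), (w->x)) is 1 when w == v and x != u."""
    directed = list(edges) + [(v, u) for u, v in edges]
    size = len(directed)
    B = np.zeros((size, size))
    for i, (u, v) in enumerate(directed):
        for j, (w, x) in enumerate(directed):
            if w == v and x != u:
                B[i, j] = 1.0
    return B

def percolation_threshold(edges):
    """Estimate the threshold as 1 / (leading eigenvalue of B).
    Returns infinity when the leading eigenvalue vanishes (e.g. trees)."""
    B = non_backtracking_matrix(edges)
    # B is nonnegative, so its spectral radius is itself an eigenvalue.
    leading = float(np.max(np.abs(np.linalg.eigvals(B))))
    if leading < 1e-12:
        return float("inf")
    return 1.0 / leading

# Complete graph on 4 nodes: every node has degree 3.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
threshold_k4 = percolation_threshold(k4_edges)
```

For a d-regular graph every row of the matrix sums to d - 1, so the leading eigenvalue is d - 1 and the estimate for the 4-node complete graph is 1/2. Such a small, loop-rich graph is used only to check the arithmetic; the abstract's claim concerns networks with few loops.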
-
Localization and centrality in networks
Authors:
Travis Martin,
Xiao Zhang,
M. E. J. Newman
Abstract:
Eigenvector centrality is a common measure of the importance of nodes in a network. Here we show that under common conditions the eigenvector centrality displays a localization transition that causes most of the weight of the centrality to concentrate on a small number of nodes in the network. In this regime the measure is no longer useful for distinguishing among the remaining nodes and its efficacy as a network metric is impaired. As a remedy, we propose an alternative centrality measure based on the nonbacktracking matrix, which gives results closely similar to the standard eigenvector centrality in dense networks where the latter is well behaved, but avoids localization and gives useful results in regimes where the standard centrality fails.
Submitted 3 January, 2015; v1 submitted 20 January, 2014;
originally announced January 2014.
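The concentration of centrality weight described in this abstract can be seen in a toy comparison. The sketch below computes standard eigenvector centrality from the leading eigenvector of the adjacency matrix and summarizes how concentrated it is with the inverse participation ratio (the sum of fourth powers of a unit-norm vector). The choice of that diagnostic, the function names, and the two test graphs are assumptions made for illustration. The nonbacktracking-based measure proposed in the paper is not implemented here.

```python
import numpy as np

def eigenvector_centrality(adjacency):
    """Leading eigenvector of a symmetric adjacency matrix,
    returned nonnegative and normalized to unit Euclidean length."""
    A = np.asarray(adjacency, dtype=float)
    _, eigvecs = np.linalg.eigh(A)      # ascending eigenvalues
    v = np.abs(eigvecs[:, -1])          # fix the arbitrary overall sign
    return v / np.linalg.norm(v)

def inverse_participation_ratio(v):
    """Roughly 1/n for an evenly spread unit vector, order 1 when the
    weight sits on a handful of nodes."""
    return float(np.sum(v ** 4))

def star_graph(leaves):
    A = np.zeros((leaves + 1, leaves + 1))
    A[0, 1:] = A[1:, 0] = 1.0           # node 0 is the hub
    return A

def ring_graph(n):
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return A

ipr_star = inverse_participation_ratio(eigenvector_centrality(star_graph(100)))
ipr_ring = inverse_participation_ratio(eigenvector_centrality(ring_graph(101)))
```

In the star, half of the squared weight sits on the hub, so the ratio stays near 1/4 however many leaves are added, whereas in the ring it falls as 1/n. The star is an extreme case chosen for clarity, not one of the networks studied in the paper.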
-
Prediction of highly cited papers
Authors:
M. E. J. Newman
Abstract:
In an article written five years ago [arXiv:0809.0522], we described a method for predicting which scientific papers will be highly cited in the future, even if they are currently not highly cited. Applying the method to real citation data we made predictions about papers we believed would end up being well cited. Here we revisit those predictions, five years on, to see how well we did. Among the over 2000 papers in our original data set, we examine the fifty that, by the measures of our previous study, were predicted to do best and we find that they have indeed received substantially more citations in the intervening years than other papers, even after controlling for the number of prior citations. On average these top fifty papers have received 23 times as many citations in the last five years as the average paper in the data set as a whole, and 15 times as many as the average paper in a randomly drawn control group that started out with the same number of citations. Applying our prediction technique to current data, we also make new predictions of papers that we believe will be well cited in the next few years.
Submitted 30 October, 2013;
originally announced October 2013.
-
The small-world effect is a modern phenomenon
Authors:
Seth A. Marvel,
Travis Martin,
Charles R. Doering,
David Lusseau,
M. E. J. Newman
Abstract:
The "small-world effect" is the observation that one can find a short chain of acquaintances, often of no more than a handful of individuals, connecting almost any two people on the planet. It is often expressed in the language of networks, where it is equivalent to the statement that most pairs of individuals are connected by a short path through the acquaintance network. Although the small-world effect is well-established empirically for contemporary social networks, we argue here that it is a relatively recent phenomenon, arising only in the last few hundred years: for most of mankind's tenure on Earth the social world was large, with most pairs of individuals connected by relatively long chains of acquaintances, if at all. Our conclusions are based on observations about the spread of diseases, which travel over contact networks between individuals and whose dynamics can give us clues to the structure of those networks even when direct network measurements are not available. As an example we consider the spread of the Black Death in 14th-century Europe, which is known to have traveled across the continent in well-defined waves of infection over the course of several years. Using established epidemiological models, we show that such wave-like behavior can occur only if contacts between individuals living far apart are exponentially rare. We further show that if long-distance contacts are exponentially rare, then the shortest chain of contacts between distant individuals is on average a long one. The observation of the wave-like spread of a disease like the Black Death thus implies a network without the small-world effect.
Submitted 9 October, 2013;
originally announced October 2013.
-
Spectra of random graphs with community structure and arbitrary degrees
Authors:
Xiao Zhang,
Raj Rao Nadakuditi,
M. E. J. Newman
Abstract:
Using methods from random matrix theory researchers have recently calculated the full spectra of random networks with arbitrary degrees and with community structure. Both reveal interesting spectral features, including deviations from the Wigner semicircle distribution and phase transitions in the spectra of community structured networks. In this paper we generalize both calculations, giving a prescription for calculating the spectrum of a network with both community structure and an arbitrary degree distribution. In general the spectrum has two parts, a continuous spectral band, which can depart strongly from the classic semicircle form, and a set of outlying eigenvalues that indicate the presence of communities.
Submitted 30 September, 2013;
originally announced October 2013.
-
Spectral community detection in sparse networks
Authors:
M. E. J. Newman
Abstract:
Spectral methods based on the eigenvectors of matrices are widely used in the analysis of network data, particularly for community detection and graph partitioning. Standard methods based on the adjacency matrix and related matrices, however, break down for very sparse networks, which includes many networks of practical interest. As a solution to this problem it has been recently proposed that we focus instead on the spectrum of the non-backtracking matrix, an alternative matrix representation of a network that shows better behavior in the sparse limit. Inspired by this suggestion, we here make use of a relaxation method to derive a spectral community detection algorithm that works well even in the sparse regime where other methods break down. Interestingly, however, the matrix at the heart of the method, it turns out, is not exactly the non-backtracking matrix, but a variant of it with a somewhat different definition. We study the behavior of this variant matrix for both artificial and real-world networks and find it to have desirable properties, especially in the common case of networks with broad degree distributions, for which it appears to have a better behaved spectrum and eigenvectors than the original non-backtracking matrix.
Submitted 29 August, 2013;
originally announced August 2013.
-
Spectral methods for network community detection and graph partitioning
Authors:
M. E. J. Newman
Abstract:
We consider three distinct and well studied problems concerning network structure: community detection by modularity maximization, community detection by statistical inference, and normalized-cut graph partitioning. Each of these problems can be tackled using spectral algorithms that make use of the eigenvectors of matrix representations of the network. We show that with certain choices of the free parameters appearing in these spectral algorithms the algorithms for all three problems are, in fact, identical, and hence that, at least within the spectral approximations used here, there is no difference between the modularity- and inference-based community detection methods, or between either and graph partitioning.
Submitted 29 July, 2013;
originally announced July 2013.
-
Community detection and graph partitioning
Authors:
M. E. J. Newman
Abstract:
Many methods have been proposed for community detection in networks. Some of the most promising are methods based on statistical inference, which rest on solid mathematical foundations and return excellent results in practice. In this paper we show that two of the most widely used inference methods can be mapped directly onto versions of the standard minimum-cut graph partitioning problem, which allows us to apply any of the many well-understood partitioning algorithms to the solution of community detection problems. We illustrate the approach by adapting the Laplacian spectral partitioning method to perform community inference, testing the resulting algorithm on a range of examples, including computer-generated and real-world networks. Both the quality of the results and the running time rival the best previous methods.
Submitted 21 May, 2013;
originally announced May 2013.
-
Interacting epidemics and coinfection on contact networks
Authors:
M. E. J. Newman,
C. R. Ferrario
Abstract:
The spread of certain diseases can be promoted, in some cases substantially, by prior infection with another disease. One example is that of HIV, whose immunosuppressant effects significantly increase the chances of infection with other pathogens. Such coinfection processes, when combined with nontrivial structure in the contact networks over which diseases spread, can lead to complex patterns of epidemiological behavior. Here we consider a mathematical model of two diseases spreading through a single population, where infection with one disease is dependent on prior infection with the other. We solve exactly for the sizes of the outbreaks of both diseases in the limit of large population size, along with the complete phase diagram of the system. Among other things, we use our model to demonstrate how diseases can be controlled not only by reducing the rate of their spread, but also by reducing the spread of other infections upon which they depend.
Submitted 20 May, 2013;
originally announced May 2013.
-
Coauthorship and citation in scientific publishing
Authors:
Travis Martin,
Brian Ball,
Brian Karrer,
M. E. J. Newman
Abstract:
A large number of published studies have examined the properties of either networks of citation among scientific papers or networks of coauthorship among scientists. Here, using an extensive data set covering more than a century of physics papers published in the Physical Review, we study a hybrid coauthorship/citation network that combines the two, which we analyze to gain insight into the correlations and interactions between authorship and citation. Among other things, we investigate the extent to which individuals tend to cite themselves or their collaborators more than others, the extent to which they cite themselves or their collaborators more quickly after publication, and the extent to which they tend to return the favor of a citation from another scientist.
Submitted 1 April, 2013;
originally announced April 2013.
-
Spectra of random graphs with arbitrary expected degrees
Authors:
Raj Rao Nadakuditi,
M. E. J. Newman
Abstract:
We study random graphs with arbitrary distributions of expected degree and derive expressions for the spectra of their adjacency and modularity matrices. We give a complete prescription for calculating the spectra that is exact in the limit of large network size and large vertex degrees. We also study the effect on the spectra of hubs in the network, vertices of unusually high degree, and show that these produce isolated eigenvalues outside the main spectral band, akin to impurity states in condensed matter systems, with accompanying eigenvectors that are strongly localized around the hubs. We also give numerical results that confirm our analytic expressions.
Submitted 6 August, 2012;
originally announced August 2012.
-
Friendship networks and social status
Authors:
Brian Ball,
M. E. J. Newman
Abstract:
In empirical studies of friendship networks participants are typically asked, in interviews or questionnaires, to identify some or all of their close friends, resulting in a directed network in which friendships can, and often do, run in only one direction between a pair of individuals. Here we analyze a large collection of such networks representing friendships among students at US high and junior-high schools and show that the pattern of unreciprocated friendships is far from random. In every network, without exception, we find that there exists a ranking of participants, from low to high, such that almost all unreciprocated friendships consist of a lower-ranked individual claiming friendship with a higher-ranked one. We present a maximum-likelihood method for deducing such rankings from observed network data and conjecture that the rankings produced reflect a measure of social status. We note in particular that reciprocated and unreciprocated friendships obey different statistics, suggesting different formation processes, and that rankings are correlated with other characteristics of the participants that are traditionally associated with status, such as age and overall popularity as measured by total number of friends.
Submitted 30 May, 2012;
originally announced May 2012.
-
Graph spectra and the detectability of community structure in networks
Authors:
Raj Rao Nadakuditi,
M. E. J. Newman
Abstract:
We study networks that display community structure -- groups of nodes within which connections are unusually dense. Using methods from random matrix theory, we calculate the spectra of such networks in the limit of large size, and hence demonstrate the presence of a phase transition in matrix methods for community detection, such as the popular modularity maximization method. The transition separates a regime in which such methods successfully detect the community structure from one in which the structure is present but is not detected. By comparing these results with recent analyses of maximum-likelihood methods we are able to show that spectral modularity maximization is an optimal detection method in the sense that no other method will succeed in the regime where the modularity method fails.
Submitted 8 May, 2012;
originally announced May 2012.
-
Complex Systems: A Survey
Authors:
M. E. J. Newman
Abstract:
A complex system is a system composed of many interacting parts, often called agents, which displays collective behavior that does not follow trivially from the behaviors of the individual parts. Examples include condensed matter systems, ecosystems, stock markets and economies, biological evolution, and indeed the whole of human society. Substantial progress has been made in the quantitative understanding of complex systems, particularly since the 1980s, using a combination of basic theory, much of it derived from physics, and computer simulation. The subject is a broad one, drawing on techniques and ideas from a wide range of areas. Here I give a survey of the main themes and methods of complex systems science and an annotated bibliography of resources, ranging from classic papers to recent books and reviews.
Submitted 6 December, 2011;
originally announced December 2011.
-
Competing epidemics on complex networks
Authors:
Brian Karrer,
M. E. J. Newman
Abstract:
Human diseases spread over networks of contacts between individuals and a substantial body of recent research has focused on the dynamics of the spreading process. Here we examine a model of two competing diseases spreading over the same network at the same time, where infection with either disease gives an individual subsequent immunity to both. Using a combination of analytic and numerical methods, we derive the phase diagram of the system and estimates of the expected final numbers of individuals infected with each disease. The system shows an unusual dynamical transition between dominance of one disease and dominance of the other as a function of their relative rates of growth. Close to this transition the final outcomes show strong dependence on stochastic fluctuations in the early stages of growth, dependence that decreases with increasing network size, but does so sufficiently slowly as still to be easily visible in systems with millions or billions of individuals. In most regions of the phase diagram we find that one disease eventually dominates while the other reaches only a vanishing fraction of the network, but the system also displays a significant coexistence regime in which both diseases reach epidemic proportions and infect an extensive fraction of the network.
Submitted 17 May, 2011;
originally announced May 2011.
-
An efficient and principled method for detecting communities in networks
Authors:
Brian Ball,
Brian Karrer,
M. E. J. Newman
Abstract:
A fundamental problem in the analysis of network data is the detection of network communities, groups of densely interconnected nodes, which may be overlapping or disjoint. Here we describe a method for finding overlapping communities based on a principled statistical approach using generative network models. We show how the method can be implemented using a fast, closed-form expectation-maximization algorithm that allows us to analyze networks of millions of nodes in reasonable running times. We test the method both on real-world networks and on synthetic benchmarks and find that it gives results competitive with previous methods. We also show that the same approach can be used to extract nonoverlapping community divisions via a relaxation method, and demonstrate that the algorithm is competitively fast and accurate for the nonoverlapping problem.
Submitted 18 April, 2011;
originally announced April 2011.
-
Stochastic blockmodels and community structure in networks
Authors:
Brian Karrer,
M. E. J. Newman
Abstract:
Stochastic blockmodels have been proposed as a tool for detecting community structure in networks as well as for generating synthetic networks for use as benchmarks. Most blockmodels, however, ignore variation in vertex degree, making them unsuitable for applications to real-world networks, which typically display broad degree distributions that can significantly distort the results. Here we demonstrate how the generalization of blockmodels to incorporate this missing element leads to an improved objective function for community detection in complex networks. We also propose a heuristic algorithm for community detection using this objective function or its non-degree-corrected counterpart and show that the degree-corrected version dramatically outperforms the uncorrected one in both real-world and synthetic networks.
Submitted 23 August, 2010;
originally announced August 2010.
-
Random graphs containing arbitrary distributions of subgraphs
Authors:
Brian Karrer,
M. E. J. Newman
Abstract:
Traditional random graph models of networks generate networks that are locally tree-like, meaning that all local neighborhoods take the form of trees. In this respect such models are highly unrealistic, most real networks having strongly non-tree-like neighborhoods that contain short loops, cliques, or other biconnected subgraphs. In this paper we propose and analyze a new class of random graph models that incorporates general subgraphs, allowing for non-tree-like neighborhoods while still remaining solvable for many fundamental network properties. Among other things we give solutions for the size of the giant component, the position of the phase transition at which the giant component appears, and percolation properties for both site and bond percolation on networks generated by the model.
Submitted 10 May, 2010;
originally announced May 2010.
-
A message passing approach for general epidemic models
Authors:
Brian Karrer,
M. E. J. Newman
Abstract:
In most models of the spread of disease over contact networks it is assumed that the probabilities per unit time of disease transmission and recovery from disease are constant, implying exponential distributions of the time intervals for transmission and recovery. Time intervals for real diseases, however, have distributions that in most cases are far from exponential, which leads to disagreements, both qualitative and quantitative, with the models. In this paper, we study a generalized version of the SIR (susceptible-infected-recovered) model of epidemic disease that allows for arbitrary distributions of transmission and recovery times. Standard differential equation approaches cannot be used for this generalized model, but we show that the problem can be reformulated as a time-dependent message passing calculation on the appropriate contact network. The calculation is exact on trees (i.e., loopless networks) or locally tree-like networks (such as random graphs) in the large system size limit. On non-tree-like networks we show that the calculation gives a rigorous bound on the size of disease outbreaks. We demonstrate the method with applications to two specific models and the results compare favorably with numerical simulations.
Submitted 22 July, 2010; v1 submitted 29 March, 2010;
originally announced March 2010.
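The step from a constant probability per unit time to an exponential distribution, mentioned in the abstract above, can be written out briefly. This is a sketch under assumed notation: the symbols \beta (transmission probability per unit time), S(t) (probability that transmission has not yet occurred by time t), and f(t) (density of the transmission time) are introduced here for illustration and are not taken from the paper.

```latex
% Assumed notation: \beta = constant transmission probability per unit time,
% S(t) = probability that transmission has not yet occurred by time t.
\begin{align}
  \frac{dS}{dt} &= -\beta\, S(t), \qquad S(0) = 1
  \quad\Longrightarrow\quad S(t) = e^{-\beta t}, \\
  f(t) &= -\frac{dS}{dt} = \beta\, e^{-\beta t}.
\end{align}
```

The same argument applies to recovery with its own constant rate. A model with non-exponential transmission or recovery times therefore corresponds to a rate that varies with the time elapsed since infection.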
-
Random graph models for directed acyclic networks
Authors:
Brian Karrer,
M. E. J. Newman
Abstract:
We study random graph models for directed acyclic graphs, an important class of networks that includes citation networks, food webs, and feed-forward neural networks among others. We propose two specific models, roughly analogous to the fixed edge number and fixed edge probability variants of traditional undirected random graphs. We calculate a number of properties of these models, including particularly the probability of connection between a given pair of vertices, and compare the results with real-world acyclic network data finding that theory and measurements agree surprisingly well -- far better than the often poor agreement of other random graph models with their corresponding real-world networks.
Submitted 24 July, 2009;
originally announced July 2009.
-
Random graphs with clustering
Authors:
M. E. J. Newman
Abstract:
We offer a solution to a long-standing problem in the physics of networks, the creation of a plausible, solvable model of a network that displays clustering or transitivity -- the propensity for two neighbors of a network node also to be neighbors of one another. We show how standard random graph models can be generalized to incorporate clustering and give exact solutions for various properties of the resulting networks, including sizes of network components, size of the giant component if there is one, position of the phase transition at which the giant component forms, and position of the phase transition for percolation on the network.
Submitted 23 March, 2009;
originally announced March 2009.
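The clustering or transitivity described in the abstract above, the propensity for two neighbors of a node also to be neighbors of one another, can be measured on a given network as the fraction of neighbor pairs that are themselves connected. The following is a minimal sketch of that standard measurement, not the random graph model of the paper; the function name `transitivity` and the toy adjacency dictionary are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def transitivity(adj):
    """Return the fraction of neighbor pairs that are themselves connected.

    `adj` maps each node to the set of its neighbors (undirected, no self-loops).
    Each pair of neighbors of a common node counts as one connected triple.
    """
    closed = 0
    triples = 0
    for node, nbrs in adj.items():
        for u, v in combinations(sorted(nbrs), 2):
            triples += 1
            if v in adj[u]:
                closed += 1
    # A network with no connected triples has zero transitivity by convention here.
    return closed / triples if triples else 0.0


# Hypothetical toy network: a triangle 0-1-2 plus a pendant node 3 attached to node 2.
adj = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2},
}
print(transitivity(adj))
```

A tree has no closed triples, so the function returns zero for any tree, consistent with the locally tree-like character of traditional random graphs.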
-
Random hypergraphs and their applications
Authors:
Gourab Ghoshal,
Vinko Zlatic,
Guido Caldarelli,
M. E. J. Newman
Abstract:
In the last few years we have witnessed the emergence, primarily in on-line communities, of new types of social networks that require for their representation more complex graph structures than have been employed in the past. One example is the folksonomy, a tripartite structure of users, resources, and tags -- labels collaboratively applied by the users to the resources in order to impart meaningful structure on an otherwise undifferentiated database. Here we propose a mathematical model of such tripartite structures which represents them as random hypergraphs. We show that it is possible to calculate many properties of this model exactly in the limit of large network size and we compare the results against observations of a real folksonomy, that of the on-line photography web site Flickr. We show that in some cases the model matches the properties of the observed network well, while in others there are significant differences, which we find to be attributable to the practice of multiple tagging, i.e., the application by a single user of many tags to one resource, or one tag to many resources.
Submitted 2 March, 2009;
originally announced March 2009.
-
Random acyclic networks
Authors:
Brian Karrer,
M. E. J. Newman
Abstract:
Directed acyclic graphs are a fundamental class of networks that includes citation networks, food webs, and family trees, among others. Here we define a random graph model for directed acyclic graphs and give solutions for a number of the model's properties, including connection probabilities and component sizes, as well as a fast algorithm for simulating the model on a computer. We compare the predictions of the model to a real-world network of citations between physics papers and find surprisingly good agreement, suggesting that the structure of the real network may be quite well described by the random graph.
Submitted 23 February, 2009;
originally announced February 2009.