-
General linear threshold models with application to influence maximization
Authors:
Alexander Kagan,
Elizaveta Levina,
Ji Zhu
Abstract:
A number of models have been developed for information spread through networks, often for solving the Influence Maximization (IM) problem. IM is the task of choosing a fixed number of nodes to "seed" with information in order to maximize the spread of this information through the network, with applications in areas such as marketing and public health. Most methods for this problem rely heavily on the assumption of known strength of connections between network members (edge weights), which is often unrealistic. In this paper, we develop a likelihood-based approach to estimate edge weights from the fully and partially observed information diffusion paths. We also introduce a broad class of information diffusion models, the general linear threshold (GLT) model, which generalizes the well-known linear threshold (LT) model by allowing arbitrary distributions of node activation thresholds. We then show our weight estimator is consistent under the GLT and some mild assumptions. For the special case of the standard LT model, we also present a much faster expectation-maximization approach for weight estimation. Finally, we prove that for the GLT models, the IM problem can be solved by a natural greedy algorithm with standard optimality guarantees if all node threshold distributions have concave cumulative distribution functions. Extensive experiments on synthetic and real-world networks demonstrate that the flexibility in the choice of threshold distribution combined with the estimation of edge weights significantly improves the quality of IM solutions, spread prediction, and the estimates of the node activation probabilities.
Submitted 13 November, 2024;
originally announced November 2024.
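To make the diffusion model described in the abstract above concrete, here is a minimal Python sketch of a linear threshold cascade with a greedy seed search by Monte Carlo. It is an illustration under stated assumptions, not the authors' estimator or code: edge weights are taken as known rather than estimated, thresholds are drawn from a user-supplied function (uniform on [0, 1) corresponds to the standard LT model, other distributions are in the spirit of the GLT model), and all names (`simulate_lt`, `expected_spread`, `greedy_seeds`, `draw_threshold`) are hypothetical.

```python
import random


def simulate_lt(weights, seeds, thresholds):
    """Run one threshold cascade to its fixed point.

    weights[v] is a dict {u: w_uv} of incoming edge weights for node v
    (assumed known here). A node activates once the total weight from
    its active in-neighbors reaches its threshold.
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, incoming in weights.items():
            if v in active:
                continue
            pressure = sum(w for u, w in incoming.items() if u in active)
            if pressure >= thresholds[v]:
                active.add(v)
                changed = True
    return active


def expected_spread(weights, seeds, draw_threshold, n_sims, rng):
    """Monte Carlo estimate of the expected number of active nodes."""
    total = 0
    for _ in range(n_sims):
        # Fresh thresholds for every simulated cascade.
        thresholds = {v: draw_threshold(rng) for v in weights}
        total += len(simulate_lt(weights, seeds, thresholds))
    return total / n_sims


def greedy_seeds(weights, k, draw_threshold, n_sims=200, seed=0):
    """Add, k times, the node with the largest estimated spread."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        candidates = [v for v in weights if v not in chosen]
        best = max(
            candidates,
            key=lambda v: expected_spread(
                weights, chosen + [v], draw_threshold, n_sims, rng
            ),
        )
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    # Toy directed path a -> b -> c with unit weights (made-up data).
    toy = {"a": {}, "b": {"a": 1.0}, "c": {"b": 1.0}}
    print(greedy_seeds(toy, 1, lambda rng: rng.random(), n_sims=50))
```

Passing the threshold sampler as a function is a design choice of this sketch: it lets the same cascade code run under different threshold distributions. The spread estimates are noisy, so in practice `n_sims` would need to be large enough for the greedy comparisons to be stable.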
-
Mesoscale two-sample testing for network data
Authors:
Peter W. MacDonald,
Elizaveta Levina,
Ji Zhu
Abstract:
Networks arise naturally in many scientific fields as a representation of pairwise connections. Statistical network analysis has most often considered a single large network, but it is common in a number of applications, for example, neuroimaging, to observe multiple networks on a shared node set. When these networks are grouped by case-control status or another categorical covariate, the classical statistical question of two-sample comparison arises. In this work, we address the problem of testing for statistically significant differences in a given arbitrary subset of connections. This general framework allows an analyst to focus on a single node, a specific region of interest, or compare whole networks. Our ability to conduct "mesoscale" testing on a meaningful group of edges is particularly relevant for applications such as neuroimaging and distinguishes our approach from prior work, which tends to focus either on a single node or the whole network. In this mesoscale setting, we develop statistically sound projection-based tests for two-sample comparison in both weighted and binary edge networks. Our approach can leverage all available network information, and learn informative projections which improve testing power when low-dimensional latent network structure is present.
Submitted 22 October, 2024;
originally announced October 2024.
-
Heterogeneous Treatment Effects under Network Interference: A Nonparametric Approach Based on Node Connectivity
Authors:
Heejong Bong,
Colin B. Fogarty,
Elizaveta Levina,
Ji Zhu
Abstract:
In network settings, interference between units makes causal inference more challenging as outcomes may depend on the treatments received by others in the network. Typical estimands in network settings focus on treatment effects aggregated across individuals in the population. We propose a framework for estimating node-wise counterfactual means, allowing for more granular insights into the impact of network structure on treatment effect heterogeneity. We develop a doubly robust and non-parametric estimation procedure, KECENI (Kernel Estimation of Causal Effect under Network Interference), which offers consistency and asymptotic normality under network dependence. The utility of this method is demonstrated through an application to microfinance data, revealing the impact of network characteristics on treatment effects.
Submitted 14 November, 2024; v1 submitted 15 October, 2024;
originally announced October 2024.
-
Computational Inference for Directions in Canonical Correlation Analysis
Authors:
Daniel Kessler,
Elizaveta Levina
Abstract:
Canonical Correlation Analysis (CCA) is a method for analyzing pairs of random vectors; it learns a sequence of paired linear transformations such that the resultant canonical variates are maximally correlated within pairs while uncorrelated across pairs. CCA outputs both canonical correlations as well as the canonical directions which define the transformations. While inference for canonical correlations is well developed, conducting inference for canonical directions is more challenging and not well-studied, but is key to interpretability. We propose a computational bootstrap method (combootcca) for inference on CCA directions. We conduct thorough simulation studies that range from simple and well-controlled to complex but realistic and validate the statistical properties of combootcca while comparing it to several competitors. We also apply the combootcca method to a brain imaging dataset and discover linked patterns in brain connectivity and behavioral scores.
Submitted 22 August, 2023;
originally announced August 2023.
-
Fair Information Spread on Social Networks with Community Structure
Authors:
Octavio Mesner,
Elizaveta Levina,
Ji Zhu
Abstract:
Information spread through social networks is ubiquitous. Influence maximization (IM) algorithms aim to identify individuals who will generate the greatest spread through the social network if provided with information, and have been largely developed with marketing in mind. In social networks with community structure, which are very common, IM algorithms focused solely on maximizing spread may yield significant disparities in information coverage between communities, which is problematic in settings such as public health messaging. While some IM algorithms aim to remedy disparity in information coverage using node attributes, none use the empirical community structure within the network itself, which may be beneficial since communities directly affect the spread of information. Further, the use of empirical network structure allows us to leverage community detection techniques, making it possible to run fair-aware algorithms when there are no relevant node attributes available, or when node attributes do not accurately capture network community structure. In contrast to other fair IM algorithms, this work relies on fitting a model to the social network which is then used to determine a seed allocation strategy for optimal fair information spread. We develop an algorithm to determine optimal seed allocations for expected fair coverage, defined through maximum entropy, provide some theoretical guarantees under appropriate conditions, and demonstrate its empirical accuracy on both simulated and real networks. Because this algorithm relies on a fitted network model and not on the network directly, it is well-suited for partially observed and noisy social networks.
Submitted 15 May, 2023;
originally announced May 2023.
-
A pseudo-likelihood approach to community detection in weighted networks
Authors:
Andressa Cerqueira,
Elizaveta Levina
Abstract:
Community structure is common in many real networks, with nodes clustered in groups sharing the same connection patterns. While many community detection methods have been developed for networks with binary edges, few of them are applicable to networks with weighted edges, which are common in practice. We propose a pseudo-likelihood community estimation algorithm derived under the weighted stochastic block model for networks with normally distributed edge weights, extending the pseudo-likelihood algorithm for binary networks, which offers some of the best combinations of accuracy and computational efficiency. We prove that the estimates obtained by the proposed method are consistent under the assumption of homogeneous networks, a weighted analogue of the planted partition model, and show that they work well in practice for both homogeneous and heterogeneous networks. We illustrate the method on simulated networks and on an fMRI dataset, where edge weights represent connectivity between brain regions and are expected to be close to normal in distribution by construction.
Submitted 10 March, 2023;
originally announced March 2023.
-
Conformal Prediction for Network-Assisted Regression
Authors:
Robert Lunde,
Elizaveta Levina,
Ji Zhu
Abstract:
An important problem in network analysis is predicting a node attribute using both network covariates, such as graph embedding coordinates or local subgraph counts, and conventional node covariates, such as demographic characteristics. While standard regression methods that make use of both types of covariates may be used for prediction, statistical inference is complicated by the fact that the nodal summary statistics are often dependent in complex ways. We show that under a mild joint exchangeability assumption, a network analog of conformal prediction achieves finite sample validity for a wide range of network covariates. We also show that a form of asymptotic conditional validity is achievable. The methods are illustrated on both simulated networks and a citation network dataset.
Submitted 22 February, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
Predicting Responses from Weighted Networks with Node Covariates in an Application to Neuroimaging
Authors:
Daniel Kessler,
Keith Levin,
Elizaveta Levina
Abstract:
We consider the setting where many networks are observed on a common node set, and each observation comprises edge weights of a network, covariates observed at each node, and an overall response. The goal is to use the edge weights and node covariates to predict the response while identifying an interpretable set of predictive features. Our motivating application is neuroimaging, where edge weights encode functional connectivity measured between brain regions, node covariates encode task activations at each brain region, and the response is disease status or score on a behavioral task. We propose an approach that constructs feature groups based on assumed community structure (naturally occurring in neuroimaging applications). We propose two feature grouping schemes that incorporate both edge weights and node covariates, and we derive algorithms for optimization using an overlapping group LASSO penalty. Empirical results on synthetic data show that our method, relative to competing approaches, has similar or improved prediction error along with superior support recovery, enabling a more interpretable and potentially more accurate understanding of the underlying process. We also apply the method to neuroimaging data from the Human Connectome Project. Our approach is widely applicable in neuroimaging where interpretability is highly desired.
Submitted 22 August, 2023; v1 submitted 31 October, 2022;
originally announced October 2022.
-
Latent process models for functional network data
Authors:
Peter W. MacDonald,
Elizaveta Levina,
Ji Zhu
Abstract:
Network data are often sampled with auxiliary information or collected through the observation of a complex system over time, leading to multiple network snapshots indexed by a continuous variable. Many methods in statistical network analysis are traditionally designed for a single network, and can be applied to an aggregated network in this setting, but that approach can miss important functional structure. Here we develop an approach to estimating the expected network explicitly as a function of a continuous index, be it time or another indexing variable. We parameterize the network expectation through low dimensional latent processes, whose components we represent with a fixed, finite-dimensional functional basis. We derive a gradient descent estimation algorithm, establish theoretical guarantees for recovery of the low dimensional structure, compare our method to competitors, and apply it to a data set of international political interactions over time, showing our proposed method to adapt well to data, outperform competitors, and provide interpretable and meaningful results.
Submitted 15 July, 2024; v1 submitted 13 October, 2022;
originally announced October 2022.
-
Network resampling for estimating uncertainty
Authors:
Qianhua Shan,
Elizaveta Levina
Abstract:
With network data becoming ubiquitous in many applications, many models and algorithms for network analysis have been proposed. Yet methods for providing uncertainty estimates in addition to point estimates of network parameters are much less common. While bootstrap and other resampling procedures have been an effective general tool for estimating uncertainty from i.i.d. samples, adapting them to networks is highly nontrivial. In this work, we study three different network resampling procedures for uncertainty estimation, and propose a general algorithm to construct confidence intervals for network parameters through network resampling. We also propose an algorithm for selecting the sampling fraction, which has a substantial effect on performance. We find that, unsurprisingly, no one procedure is empirically best for all tasks, but that selecting an appropriate sampling fraction substantially improves performance in many cases. We illustrate this on simulated networks and on Facebook data.
Submitted 27 June, 2022;
originally announced June 2022.
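As a rough illustration of what a network resampling procedure with a sampling fraction can look like, the Python sketch below repeatedly draws a random subset of nodes, keeps the induced subgraph, and recomputes a simple statistic (edge density). This is an assumption-laden sketch: the abstract above does not name its three procedures, node subsampling is used here only as one familiar example, and the function names (`edge_density`, `node_subsample_densities`) and the `fraction` parameter are hypothetical. It does not implement the paper's confidence interval algorithm or its method for choosing the sampling fraction.

```python
import random


def edge_density(adj, nodes):
    """Fraction of node pairs in `nodes` joined by an edge.

    adj maps each node to the set of its neighbors (undirected graph).
    """
    nodes = list(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    edges = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if nodes[j] in adj[nodes[i]]
    )
    return edges / (n * (n - 1) / 2)


def node_subsample_densities(adj, fraction, n_reps=500, seed=0):
    """Edge density of induced subgraphs on random node subsets."""
    rng = random.Random(seed)
    all_nodes = sorted(adj)
    # Subsample size: at least 2 nodes, at most the whole node set.
    m = min(len(all_nodes), max(2, round(fraction * len(all_nodes))))
    return [
        edge_density(adj, rng.sample(all_nodes, m)) for _ in range(n_reps)
    ]


if __name__ == "__main__":
    # Made-up 5-node undirected graph stored as neighbor sets.
    toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
    reps = node_subsample_densities(toy, fraction=0.6, n_reps=200)
    print(min(reps), max(reps))
```

The spread of the replicates depends on `fraction`, which gives some intuition for why the choice of sampling fraction matters.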
-
Selective Inference for Sparse Multitask Regression with Applications in Neuroimaging
Authors:
Snigdha Panigrahi,
Natasha Stewart,
Chandra Sekhar Sripada,
Elizaveta Levina
Abstract:
Multi-task learning is frequently used to model a set of related response variables from the same set of features, improving predictive performance and modeling accuracy relative to methods that handle each response variable separately. Despite the potential of multi-task learning to yield more powerful inference than single-task alternatives, prior work in this area has largely omitted uncertainty quantification. Our focus in this paper is a common multi-task problem in neuroimaging, where the goal is to understand the relationship between multiple cognitive task scores (or other subject-level assessments) and brain connectome data collected from imaging. We propose a framework for selective inference to address this problem, with the flexibility to: (i) jointly identify the relevant covariates for each task through a sparsity-inducing penalty, and (ii) conduct valid inference in a model based on the estimated sparsity structure. Our framework offers a new conditional procedure for inference, based on a refinement of the selection event that yields a tractable selection-adjusted likelihood. This gives an approximate system of estimating equations for maximum likelihood inference, solvable via a single convex optimization problem, and enables us to efficiently form confidence intervals with approximately the correct coverage. Applied to both simulated data and data from the Adolescent Brain Cognitive Development (ABCD) study, our selective inference methods yield tighter confidence intervals than commonly used alternatives, such as data splitting. We also demonstrate through simulations that multi-task learning with selective inference can more accurately recover true signals than single-task methods.
Submitted 9 August, 2023; v1 submitted 27 May, 2022;
originally announced May 2022.
-
Latent space models for multiplex networks with shared structure
Authors:
Peter W. MacDonald,
Elizaveta Levina,
Ji Zhu
Abstract:
Latent space models are frequently used for modeling single-layer networks and include many popular special cases, such as the stochastic block model and the random dot product graph. However, they are not well-developed for more complex network structures, which are becoming increasingly common in practice. Here we propose a new latent space model for multiplex networks: multiple, heterogeneous networks observed on a shared node set. Multiplex networks can represent a network sample with shared node labels, a network evolving over time, or a network with multiple types of edges. The key feature of our model is that it learns from data how much of the network structure is shared between layers and pools information across layers as appropriate. We establish identifiability, develop a fitting procedure using convex optimization in combination with a nuclear norm penalty, and prove a guarantee of recovery for the latent positions as long as there is sufficient separation between the shared and the individual latent subspaces. We compare the model to competing methods in the literature on simulated networks and on a multiplex network describing the worldwide trade of agricultural products.
Submitted 7 July, 2021; v1 submitted 28 December, 2020;
originally announced December 2020.
-
Overlapping community detection in networks via sparse spectral decomposition
Authors:
Jesús Arroyo,
Elizaveta Levina
Abstract:
We consider the problem of estimating overlapping community memberships in a network, where each node can belong to multiple communities. More than a few communities per node are difficult to both estimate and interpret, so we focus on sparse node membership vectors. Our algorithm is based on sparse principal subspace estimation with iterative thresholding. The method is computationally efficient, with a computational cost equivalent to estimating the leading eigenvectors of the adjacency matrix, and does not require an additional clustering step, unlike spectral clustering methods. We show that a fixed point of the algorithm corresponds to correct node memberships under a version of the stochastic block model. The methods are evaluated empirically on simulated and real-world networks, showing good statistical performance and computational efficiency.
Submitted 15 February, 2021; v1 submitted 20 September, 2020;
originally announced September 2020.
-
Community models for networks observed through edge nominations
Authors:
Tianxi Li,
Elizaveta Levina,
Ji Zhu
Abstract:
Communities are a common and widely studied structure in networks, typically under the assumption that the network is fully and correctly observed. In practice, network data are often collected by querying nodes about their connections. In some settings, all edges of a sampled node will be recorded, and in others, a node may be asked to name its connections. These sampling mechanisms introduce noise and bias which can obscure the community structure and invalidate assumptions underlying standard community detection methods. We propose a general model for a class of network sampling mechanisms based on recording edges via querying nodes, designed to improve community detection for network data collected in this fashion. We model edge sampling probabilities as a function of both individual preferences and community parameters, and show community detection can be performed by spectral clustering under this general class of models. We also propose, as a special case of the general framework, a parametric model for directed networks we call the nomination stochastic block model, which allows for meaningful parameter interpretations and can be fitted by the method of moments. Both spectral clustering and the method of moments in this case are computationally efficient and come with theoretical guarantees of consistency. We evaluate the proposed model in simulation studies on both unweighted and weighted networks and apply it to a faculty hiring dataset, discovering a meaningful hierarchy of communities among US business schools.
Submitted 18 March, 2021; v1 submitted 9 August, 2020;
originally announced August 2020.
-
Simultaneous prediction and community detection for networks with application to neuroimaging
Authors:
Jesús Arroyo,
Elizaveta Levina
Abstract:
Community structure in networks is observed in many different domains, and unsupervised community detection has received a lot of attention in the literature. Increasingly the focus of network analysis is shifting towards using network information in some other prediction or inference task rather than just analyzing the network itself. In particular, in neuroimaging applications brain networks are available for multiple subjects and the goal is often to predict a phenotype of interest. Community structure is well known to be a feature of brain networks, typically corresponding to different regions of the brain responsible for different functions. There are standard parcellations of the brain into such regions, usually obtained by applying clustering methods to brain connectomes of healthy subjects. However, when the goal is predicting a phenotype or distinguishing between different conditions, these static communities from an unrelated set of healthy subjects may not be the most useful for prediction. Here we present a method for supervised community detection, aiming to find a partition of the network into communities that is most useful for predicting a particular response. We use a block-structured regularization penalty combined with a prediction loss function, and compute the solution with a combination of a spectral method and an ADMM optimization algorithm. We show that the spectral clustering method recovers the correct communities under a weighted stochastic block model. The method performs well on both simulated and real brain networks, providing support for the idea of task-dependent brain regions.
Submitted 27 February, 2020; v1 submitted 5 February, 2020;
originally announced February 2020.
-
Bootstrapping Networks with Latent Space Structure
Authors:
Keith Levin,
Elizaveta Levina
Abstract:
A core problem in statistical network analysis is to develop network analogues of classical techniques. The problem of bootstrapping network data stands out as especially challenging, since typically one observes only a single network, rather than a sample. Here we propose two methods for obtaining bootstrap samples for networks drawn from latent space models. The first method generates bootstrap replicates of network statistics that can be represented as U-statistics in the latent positions, and avoids actually constructing new bootstrapped networks. The second method generates bootstrap replicates of whole networks, and thus can be used for bootstrapping any network function. Commonly studied network quantities that can be represented as U-statistics include many popular summaries, such as average degree and subgraph counts, but other equally popular summaries, such as the clustering coefficient, are not expressible as U-statistics and thus require the second bootstrap method. Under the assumption of a random dot product graph, a type of latent space network model, we show consistency of the proposed bootstrap methods. We give motivating examples throughout and demonstrate the effectiveness of our methods on synthetic data.
Submitted 11 October, 2021; v1 submitted 24 July, 2019;
originally announced July 2019.
-
High-dimensional Gaussian graphical model for network-linked data
Authors:
Tianxi Li,
Cheng Qian,
Elizaveta Levina,
Ji Zhu
Abstract:
Graphical models are commonly used to represent conditional dependence relationships between variables. There are multiple methods available for exploring them from high-dimensional data, but almost all of them rely on the assumption that the observations are independent and identically distributed. At the same time, observations connected by a network are becoming increasingly common, and tend to violate these assumptions. Here we develop a Gaussian graphical model for observations connected by a network with potentially different mean vectors, varying smoothly over the network. We propose an efficient estimation algorithm and demonstrate its effectiveness on both simulated and real data, obtaining meaningful and interpretable results on a statistics coauthorship network. We also prove that our method estimates both the inverse covariance matrix and the corresponding graph structure correctly under the assumption of network "cohesion", which refers to the empirically observed phenomenon of network neighbors sharing similar traits.
Submitted 21 April, 2020; v1 submitted 4 July, 2019;
originally announced July 2019.
-
Recovering shared structure from multiple networks with unknown edge distributions
Authors:
Keith Levin,
Asad Lodhia,
Elizaveta Levina
Abstract:
In increasingly many settings, data sets consist of multiple samples from a population of networks, with vertices aligned across these networks. For example, brain connectivity networks in neuroscience consist of measures of interaction between brain regions that have been aligned to a common template. We consider the setting where the observed networks have a shared expectation, but may differ in the noise structure on their edges. Our approach exploits the shared mean structure to denoise edge-level measurements of the observed networks and estimate the underlying population-level parameters. We also explore the extent to which edge-level errors influence estimation and downstream inference. We establish a finite-sample concentration inequality for the low-rank eigenvalue truncation of a random weighted adjacency matrix that may be of independent interest. The proposed approach is illustrated on synthetic networks and on data from an fMRI study of schizophrenia.
Submitted 8 May, 2021; v1 submitted 12 June, 2019;
originally announced June 2019.
-
Graph-aware Modeling of Brain Connectivity Networks
Authors:
Yura Kim,
Daniel Kessler,
Elizaveta Levina
Abstract:
Functional connections in the brain are frequently represented by weighted networks, with nodes representing locations in the brain, and edges representing the strength of connectivity between these locations. One challenge in analyzing such data is that inference at the individual edge level is not particularly biologically meaningful; interpretation is more useful at the level of so-called functional regions, or groups of nodes and connections between them; this is often called "graph-aware" inference in the neuroimaging literature. However, pooling over functional regions leads to significant loss of information and lower accuracy. Another challenge is correlation among edge weights within a subject, which makes inference based on independence assumptions unreliable. We address both of these challenges with a linear mixed effects model, which accounts for functional regions and for edge dependence, while still modeling individual edge weights to avoid loss of information. The model allows for comparing two populations, such as patients and healthy controls, both at the functional region level and at the individual edge level, leading to biologically meaningful interpretations. We fit this model to resting-state fMRI data on schizophrenics and healthy controls, obtaining interpretable results consistent with the schizophrenia literature.
Submitted 26 September, 2022; v1 submitted 5 March, 2019;
originally announced March 2019.
-
Hierarchical community detection by recursive partitioning
Authors:
Tianxi Li,
Lihua Lei,
Sharmodeep Bhattacharyya,
Koen Van den Berge,
Purnamrita Sarkar,
Peter J. Bickel,
Elizaveta Levina
Abstract:
The problem of community detection in networks is usually formulated as finding a single partition of the network into some "correct" number of communities. We argue that it is more interpretable and in some regimes more accurate to construct a hierarchical tree of communities instead. This can be done with a simple top-down recursive partitioning algorithm, starting with a single community and separating the nodes into two communities by spectral clustering repeatedly, until a stopping rule suggests there are no further communities. This class of algorithms is model-free, computationally efficient, and requires no tuning other than selecting a stopping rule. We show that there are regimes where this approach outperforms K-way spectral clustering, and propose a natural framework for analyzing the algorithm's theoretical performance, the binary tree stochastic block model. Under this model, we prove that the algorithm correctly recovers the entire community tree under relatively mild assumptions. We apply the algorithm to a gene network based on gene co-occurrence in 1580 research papers on anemia, and identify six clusters of genes in a meaningful hierarchy. We also illustrate the algorithm on a dataset of statistics papers.
Submitted 14 May, 2020; v1 submitted 2 October, 2018;
originally announced October 2018.
-
Link prediction for egocentrically sampled networks
Authors:
Yun-Jhong Wu,
Elizaveta Levina,
Ji Zhu
Abstract:
Link prediction in networks is typically accomplished by estimating or ranking the probabilities of edges for all pairs of nodes. In practice, especially for social networks, the data are often collected by egocentric sampling, which means selecting a subset of nodes and recording all of their edges. This sampling mechanism requires different prediction tools than the typical assumption of links missing at random. We propose a new computationally efficient link prediction algorithm for egocentrically sampled networks, which estimates the underlying probability matrix by estimating its row space. For networks created by sampling rows, our method outperforms many popular link prediction and graphon estimation techniques.
Submitted 11 March, 2018;
originally announced March 2018.
-
Generalized linear models with low rank effects for network data
Authors:
Yun-Jhong Wu,
Elizaveta Levina,
Ji Zhu
Abstract:
Networks are a useful representation for data on connections between units of interest, but the observed connections are often noisy and/or include missing values. One common approach to network analysis is to treat the network as a realization from a random graph model, and estimate the underlying edge probability matrix, which is sometimes referred to as network denoising. Here we propose a generalized linear model with low rank effects to model network edges. This model can be applied to various types of networks, including directed and undirected, binary and weighted, and it can naturally utilize additional information such as node and/or edge covariates. We develop an efficient projected gradient ascent algorithm to fit the model, establish asymptotic consistency, and demonstrate empirical performance of the method on both simulated and real networks.
Submitted 18 May, 2017;
originally announced May 2017.
-
Network classification with applications to brain connectomics
Authors:
Jesús D. Arroyo-Relión,
Daniel Kessler,
Elizaveta Levina,
Stephan F. Taylor
Abstract:
While statistical analysis of a single network has received a lot of attention in recent years, with a focus on social networks, analysis of a sample of networks presents its own challenges which require a different set of analytic tools. Here we study the problem of classification of networks with labeled nodes, motivated by applications in neuroimaging. Brain networks are constructed from imaging data to represent functional connectivity between regions of the brain, and previous work has shown the potential of such networks to distinguish between various brain disorders, giving rise to a network classification problem. Existing approaches tend to either treat all edge weights as a long vector, ignoring the network structure, or focus on graph topology as represented by summary measures while ignoring the edge weights. Our goal is to design a classification method that uses both the individual edge information and the network structure of the data in a computationally efficient way, and that can produce a parsimonious and interpretable representation of differences in brain connectivity patterns between classes. We propose a graph classification method that uses edge weights as predictors but incorporates the network nature of the data via penalties that promote sparsity in the number of nodes, in addition to the usual sparsity penalties that encourage selection of edges. We implement the method via efficient convex optimization and provide a detailed analysis of data from two fMRI studies of schizophrenia.
Submitted 1 February, 2019; v1 submitted 27 January, 2017;
originally announced January 2017.
-
Network cross-validation by edge sampling
Authors:
Tianxi Li,
Elizaveta Levina,
Ji Zhu
Abstract:
While many statistical models and methods are now available for network analysis, resampling network data remains a challenging problem. Cross-validation is a useful general tool for model selection and parameter tuning, but is not directly applicable to networks since splitting network nodes into groups requires deleting edges and destroys some of the network structure. Here we propose a new network resampling strategy, based on splitting node pairs rather than nodes, that is applicable to cross-validation for a wide range of network model selection tasks. We provide a theoretical justification for our method in a general setting and examples of how our method can be used in specific network model selection and parameter tuning tasks. Numerical results on simulated networks and on a citation network of statisticians show that this cross-validation approach works well for model selection.
Submitted 1 May, 2020; v1 submitted 14 December, 2016;
originally announced December 2016.
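The abstract above describes resampling by splitting node pairs rather than nodes. The following is a minimal sketch of that splitting step only, not the authors' full procedure: the function names `split_node_pairs` and `mask_adjacency`, the uniform random holdout, and the 4-node example network are all assumptions made for illustration.

```python
import random
from itertools import combinations


def split_node_pairs(n_nodes, holdout_fraction, seed=0):
    """Randomly split all unordered node pairs (i, j), i < j,
    into a training set and a held-out set.

    Hypothetical helper: a uniform random split is assumed here.
    """
    pairs = list(combinations(range(n_nodes), 2))
    rng = random.Random(seed)
    rng.shuffle(pairs)
    n_holdout = int(round(holdout_fraction * len(pairs)))
    return pairs[n_holdout:], pairs[:n_holdout]


def mask_adjacency(adj, held_out):
    """Return a copy of the adjacency matrix in which held-out pairs
    are marked as unobserved (None); every node stays in the network."""
    masked = [row[:] for row in adj]
    for i, j in held_out:
        masked[i][j] = None
        masked[j][i] = None
    return masked


# Hypothetical 4-node undirected network.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
train_pairs, test_pairs = split_node_pairs(4, holdout_fraction=0.5, seed=1)
masked = mask_adjacency(adj, test_pairs)
```

A candidate model would then be fitted to the observed entries of `masked` and scored on `test_pairs`. No node is deleted by the split, so the node set is the same in every fold.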
-
Prediction models for network-linked data
Authors:
Tianxi Li,
Elizaveta Levina,
Ji Zhu
Abstract:
Prediction algorithms typically assume the training data are independent samples, but in many modern applications samples come from individuals connected by a network. For example, in adolescent health studies of risk-taking behaviors, information on the subjects' social network is often available and plays an important role through network cohesion, the empirically observed phenomenon of friends behaving similarly. Taking cohesion into account in prediction models should allow us to improve their performance. Here we propose a network-based penalty on individual node effects to encourage similarity between predictions for linked nodes, and show that incorporating it into prediction leads to improvement over traditional models both theoretically and empirically when network cohesion is present. The penalty can be used with many loss-based prediction methods, such as regression, generalized linear models, and Cox's proportional hazards model. Applications to predicting levels of recreational activity and marijuana usage among teenagers from the AddHealth study based on both demographic covariates and friendship networks are discussed in detail and show that our approach to taking friendships into account can significantly improve predictions of behavior while providing interpretable estimates of covariate effects.
Submitted 25 June, 2018; v1 submitted 3 February, 2016;
originally announced February 2016.
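The abstract above refers to a network-based penalty on individual node effects that encourages similar predictions for linked nodes. The sketch below assumes a sum of squared differences over edges, which is one common form of such a penalty; the abstract does not state the exact form. The names `cohesion_penalty` and `penalized_loss`, the squared-error loss, and the toy data are hypothetical.

```python
def cohesion_penalty(alpha, edges):
    """Sum of squared differences of node effects across edges.

    Assumed form: it is zero when all linked nodes share the same effect.
    """
    return sum((alpha[i] - alpha[j]) ** 2 for i, j in edges)


def penalized_loss(y, alpha, edges, lam):
    """Squared-error loss plus lam times the cohesion penalty.

    Illustrative only: the node effect alpha[i] is used directly as the
    prediction for node i, with no covariates.
    """
    fit = sum((yi - ai) ** 2 for yi, ai in zip(y, alpha))
    return fit + lam * cohesion_penalty(alpha, edges)


# Hypothetical path network 0 - 1 - 2 with responses y.
edges = [(0, 1), (1, 2)]
y = [1.0, 3.0, 1.0]
smooth = [1.0, 1.0, 1.0]  # identical effects on linked nodes
rough = [1.0, 3.0, 1.0]   # fits y exactly but differs across edges

print(cohesion_penalty(smooth, edges))
print(cohesion_penalty(rough, edges))
print(penalized_loss(y, rough, edges, lam=0.5))
```

Larger values of `lam` favor effects that vary little across edges, at the cost of a looser fit to the individual responses.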
-
Estimating network edge probabilities by neighborhood smoothing
Authors:
Yuan Zhang,
Elizaveta Levina,
Ji Zhu
Abstract:
The estimation of probabilities of network edges from the observed adjacency matrix has important applications to predicting missing links and network denoising. It has usually been addressed by estimating the graphon, a function that determines the matrix of edge probabilities, but this is ill-defined without strong assumptions on the network structure. Here we propose a novel computationally efficient method, based on neighborhood smoothing, to estimate the expectation of the adjacency matrix directly, without making the structural assumptions that graphon estimation requires. The neighborhood smoothing method requires little tuning, has a competitive mean-squared error rate, and outperforms many benchmark methods on link prediction in simulated and real networks.
Submitted 8 July, 2017; v1 submitted 29 September, 2015;
originally announced September 2015.
-
Estimating heterogeneous graphical models for discrete data with an application to roll call voting
Authors:
Jian Guo,
Jie Cheng,
Elizaveta Levina,
George Michailidis,
Ji Zhu
Abstract:
We consider the problem of jointly estimating a collection of graphical models for discrete data, corresponding to several categories that share some common structure. An example of such a setting is voting records of legislators on different issues, such as defense, energy, and healthcare. We develop a Markov graphical model to characterize the heterogeneous dependence structures arising from such data. The model is fitted via a joint estimation method that preserves the underlying common graph structure, but also allows for differences between the networks. The method employs a group penalty that targets the common zero interaction effects across all the networks. We apply the method to describe the internal networks of the U.S. Senate on several important issues. Our analysis reveals individual structure for each issue, distinct from the underlying well-known bipartisan structure common to all categories, which we are able to extract separately. We also establish consistency of the proposed method both for parameter estimation and model selection, and evaluate its numerical performance on a number of simulated examples.
Submitted 16 September, 2015;
originally announced September 2015.
-
Community Detection in Networks with Node Features
Authors:
Yuan Zhang,
Elizaveta Levina,
Ji Zhu
Abstract:
Many methods have been proposed for community detection in networks, but most of them do not take into account additional information on the nodes that is often available in practice. In this paper, we propose a new joint community detection criterion that uses both the network edge information and the node features to detect community structures. One advantage our method has over existing joint detection approaches is the flexibility of learning the impact of different features which may differ across communities. Another advantage is the flexibility of choosing the amount of influence the feature information has on communities. The method is asymptotically consistent under the block model with additional assumptions on the feature distributions, and performs well on simulated and real networks.
Submitted 3 September, 2015;
originally announced September 2015.
-
Estimating the number of communities in networks by spectral methods
Authors:
Can M. Le,
Elizaveta Levina
Abstract:
Community detection is a fundamental problem in network analysis with many methods available to estimate communities. Most of these methods assume that the number of communities is known, which is often not the case in practice. We study a simple and very fast method for estimating the number of communities based on the spectral properties of certain graph operators, such as the non-backtracking matrix and the Bethe Hessian matrix. We show that the method performs well under several models and a wide range of parameters, and is guaranteed to be consistent under several asymptotic regimes. We compare this method to several existing methods for estimating the number of communities and show that it is both more accurate and more computationally efficient.
Submitted 14 November, 2019; v1 submitted 3 July, 2015;
originally announced July 2015.
-
Detecting Overlapping Communities in Networks Using Spectral Methods
Authors:
Yuan Zhang,
Elizaveta Levina,
Ji Zhu
Abstract:
Community detection is a fundamental problem in network analysis which is made more challenging by overlaps between communities which often occur in practice. Here we propose a general, flexible, and interpretable generative model for overlapping communities, which can be thought of as a generalization of the degree-corrected stochastic block model. We develop an efficient spectral algorithm for estimating the community memberships, which deals with the overlaps by employing the K-medians algorithm rather than the usual K-means for clustering in the spectral domain. We show that the algorithm is asymptotically consistent when networks are not too sparse and the overlaps between communities not too large. Numerical experiments on both simulated networks and many real social networks demonstrate that our method performs very well compared to a number of benchmark methods for overlapping community detection.
Submitted 12 March, 2015; v1 submitted 10 December, 2014;
originally announced December 2014.
-
On semidefinite relaxations for the block model
Authors:
Arash A. Amini,
Elizaveta Levina
Abstract:
The stochastic block model (SBM) is a popular tool for community detection in networks, but fitting it by maximum likelihood (MLE) involves a computationally infeasible optimization problem. We propose a new semidefinite programming (SDP) solution to the problem of fitting the SBM, derived as a relaxation of the MLE. We put our SDP and previously proposed SDPs in a unified framework, as relaxations of the MLE over various sub-classes of the SBM, revealing a connection to sparse PCA. Our main relaxation, which we call SDP-1, is tighter than other recently proposed SDP relaxations, and thus previously established theoretical guarantees carry over. However, we show that SDP-1 exactly recovers true communities over a wider class of SBMs than those covered by current results. In particular, the assumption of strong assortativity of the SBM, implicit in consistency conditions for previously proposed SDPs, can be relaxed to weak assortativity for our approach, thus significantly broadening the class of SBMs covered by the consistency results. We also show that strong assortativity is indeed a necessary condition for exact recovery for previously proposed SDP approaches and not an artifact of the proofs. Our analysis of SDPs is based on primal-dual witness constructions, which provide some insight into the nature of the solutions of various SDPs. We show how to combine features from SDP-1 and already available SDPs to achieve the most flexibility in terms of both assortativity and block-size constraints, as our relaxation has the tendency to produce communities of similar sizes. This tendency makes it the ideal tool for fitting network histograms, a method gaining popularity in the graphon estimation literature, as we illustrate on an example of a social network of dolphins. We also provide empirical evidence that SDPs outperform spectral methods for fitting SBMs with a large number of blocks.
Submitted 16 March, 2016; v1 submitted 21 June, 2014;
originally announced June 2014.
-
Optimization via Low-rank Approximation for Community Detection in Networks
Authors:
Can M. Le,
Elizaveta Levina,
Roman Vershynin
Abstract:
Community detection is one of the fundamental problems of network analysis, for which a number of methods have been proposed. Most model-based or criteria-based methods have to solve an optimization problem over a discrete set of labels to find communities, which is computationally infeasible. Some fast spectral algorithms have been proposed for specific methods or models, but only on a case-by-case basis. Here we propose a general approach for maximizing a function of a network adjacency matrix over discrete labels by projecting the set of labels onto a subspace approximating the leading eigenvectors of the expected adjacency matrix. This projection onto a low-dimensional space makes the feasible set of labels much smaller and the optimization problem much easier. We prove a general result about this method and show how to apply it to several previously proposed community detection criteria, establishing its consistency for label estimation in each case and demonstrating the fundamental connection between spectral properties of the network and various model-based approaches to community detection. Simulations and applications to real-world data are included to demonstrate our method performs well for multiple problems over a wide range of parameters.
Submitted 10 May, 2015; v1 submitted 31 May, 2014;
originally announced June 2014.
-
Structured functional regression models for high-dimensional spatial spectroscopy data
Authors:
Arash A. Amini,
Elizaveta Levina,
Kerby A. Shedden
Abstract:
Modeling and analysis of spectroscopy data is an active area of research with applications to chemistry and biology. This paper focuses on analyzing Raman spectra obtained from a bone fracture healing experiment, although the functional regression model for predicting a scalar response from high-dimensional tensors can be applied to any spectroscopy data. The regression model is built on a sparse functional representation of the spectra, and accommodates multiple spatial dimensions. We apply our models to the task of predicting bone-mineral-density (BMD), an important indicator of fracture healing, from Raman spectra, in both the in vivo and ex vivo settings of the bone fracture healing experiment. To illustrate the general applicability of the method, we also use it to predict lipoprotein concentrations from spectra obtained by nuclear magnetic resonance (NMR) spectroscopy.
Submitted 2 November, 2013;
originally announced November 2013.
-
High-dimensional Mixed Graphical Models
Authors:
Jie Cheng,
Tianxi Li,
Elizaveta Levina,
Ji Zhu
Abstract:
While graphical models for continuous data (Gaussian graphical models) and discrete data (Ising models) have been extensively studied, there is little work on graphical models linking both continuous and discrete variables (mixed data), which are common in many scientific applications. We propose a novel graphical model for mixed data, which is simple enough to be suitable for high-dimensional data, yet flexible enough to represent all possible graph structures. We develop a computationally efficient regression-based algorithm for fitting the model by focusing on the conditional log-likelihood of each variable given the rest. The parameters have a natural group structure, and sparsity in the fitted graph is attained by incorporating a group lasso penalty, approximated by a weighted $\ell_1$ penalty for computational efficiency. We demonstrate the effectiveness of our method through an extensive simulation study and apply it to a music annotation data set (CAL500), obtaining a sparse and interpretable graphical model relating the continuous features of the audio signal to categorical variables such as genre, emotions, and usage associated with particular songs. While we focus on binary discrete variables, we also show that the proposed methodology can be easily extended to general discrete variables.
Submitted 19 August, 2016; v1 submitted 9 April, 2013;
originally announced April 2013.
-
Link prediction for partially observed networks
Authors:
Yunpeng Zhao,
Elizaveta Levina,
Ji Zhu
Abstract:
Link prediction is one of the fundamental problems in network analysis. In many applications, notably in genetics, a partially observed network may not contain any negative examples of absent edges, which creates a difficulty for many existing supervised learning approaches. We develop a new method which treats the observed network as a sample of the true network with different sampling rates for positive and negative examples. We obtain a relative ranking of potential links by their probabilities, utilizing information on node covariates as well as on network topology. Empirically, the method performs well under many settings, including when the observed network is sparse. We apply the method to a protein-protein interaction network and a school friendship network.
Submitted 29 January, 2013;
originally announced January 2013.
-
Sparse Ising Models with Covariates
Authors:
Jie Cheng,
Elizaveta Levina,
Pei Wang,
Ji Zhu
Abstract:
There has been a lot of work fitting Ising models to multivariate binary data in order to understand the conditional dependency relationships between the variables. However, additional covariates are frequently recorded together with the binary data, and may influence the dependence relationships. Motivated by such a dataset on genomic instability collected from tumor samples of several types, we propose a sparse covariate dependent Ising model to study both the conditional dependency within the binary data and its relationship with the additional covariates. This results in subject-specific Ising models, where the subject's covariates influence the strength of association between the genes. As in all exploratory data analysis, interpretability of results is important, and we use L1 penalties to induce sparsity in the fitted graphs and in the number of selected covariates. Two algorithms to fit the model are proposed and compared on a set of simulated data, and asymptotic results are established. The results on the tumor dataset and their biological significance are discussed in detail.
Submitted 27 September, 2012;
originally announced September 2012.
-
Pseudo-likelihood methods for community detection in large sparse networks
Authors:
Arash A. Amini,
Aiyou Chen,
Peter J. Bickel,
Elizaveta Levina
Abstract:
Many algorithms have been proposed for fitting network models with communities, but most of them do not scale well to large networks, and often fail on sparse networks. Here we propose a new fast pseudo-likelihood method for fitting the stochastic block model for networks, as well as a variant that allows for an arbitrary degree distribution by conditioning on degrees. We show that the algorithms perform well under a range of settings, including on very sparse networks, and illustrate on the example of a network of political blogs. We also propose spectral clustering with perturbations, a method of independent interest, which works well on sparse networks where regular spectral clustering fails, and use it to provide an initial value for pseudo-likelihood. We prove that pseudo-likelihood provides consistent estimates of the communities under a mild condition on the starting value, for the case of a block model with two communities.
Submitted 5 November, 2013; v1 submitted 10 July, 2012;
originally announced July 2012.
-
Community extraction for social networks
Authors:
Yunpeng Zhao,
Elizaveta Levina,
Ji Zhu
Abstract:
Analysis of networks, and in particular discovering communities within networks, has been a focus of recent work in several fields, with applications ranging from citation and friendship networks to food webs and gene regulatory networks. Most of the existing community detection methods focus on partitioning the entire network into communities, with the expectation of many ties within communities and few ties between. However, many networks contain nodes that do not fit in with any of the communities, and forcing every node into a community can distort results. Here we propose a new framework that focuses on community extraction instead of partition, extracting one community at a time. The main idea behind extraction is that the strength of a community should not depend on ties between members of other communities, but only on ties within that community and its ties to the outside world. We show that the new extraction criterion performs well on simulated and real networks, and establish asymptotic consistency of our method under the block model assumption.
Submitted 18 May, 2010;
originally announced May 2010.
-
A new approach to Cholesky-based covariance regularization in high dimensions
Authors:
Adam J. Rothman,
Elizaveta Levina,
Ji Zhu
Abstract:
In this paper we propose a new regression interpretation of the Cholesky factor of the covariance matrix, as opposed to the well-known regression interpretation of the Cholesky factor of the inverse covariance, which leads to a new class of regularized covariance estimators suitable for high-dimensional problems. Regularizing the Cholesky factor of the covariance via this regression interpretation always results in a positive definite estimator. In particular, one can obtain a positive definite banded estimator of the covariance matrix at the same computational cost as the popular banded estimator proposed by Bickel and Levina (2008b), which is not guaranteed to be positive definite. We also establish theoretical connections between banding Cholesky factors of the covariance matrix and its inverse and constrained maximum likelihood estimation under the banding constraint, and compare the numerical performance of several methods in simulations and on a sonar data example.
Submitted 3 March, 2009;
originally announced March 2009.
-
Sparse estimation of large covariance matrices via a nested Lasso penalty
Authors:
Elizaveta Levina,
Adam Rothman,
Ji Zhu
Abstract:
The paper proposes a new covariance estimator for large covariance matrices when the variables have a natural ordering. Using the Cholesky decomposition of the inverse, we impose a banded structure on the Cholesky factor, and select the bandwidth adaptively for each row of the Cholesky factor, using a novel penalty we call nested Lasso. This structure has more flexibility than regular banding, but, unlike regular Lasso applied to the entries of the Cholesky factor, results in a sparse estimator for the inverse of the covariance matrix. An iterative algorithm for solving the optimization problem is developed. The estimator is compared to a number of other covariance estimators and is shown to do best, both in simulations and on a real data example. Simulations show that the margin by which the estimator outperforms its competitors tends to increase with dimension.
Submitted 27 March, 2008;
originally announced March 2008.