-
Stein-like Estimators for Causal Mediation Analysis in Randomized Trials
Authors:
Cedric E. Ginestet,
Richard Emsley,
Sabine Landau
Abstract:
Causal mediation analysis aims to estimate the natural direct and indirect effects under clearly specified assumptions. Traditional mediation analysis based on Ordinary Least Squares (OLS) relies on the absence of unmeasured causes of the putative mediator and outcome. When this assumption cannot be justified, Instrumental Variables (IV) estimators can be used in order to produce an asymptotically unbiased estimator of the mediator-outcome link. However, provided that valid instruments exist, bias removal comes at the cost of variance inflation for standard IV procedures such as Two-Stage Least Squares (TSLS). A Semi-Parametric Stein-Like (SPSL) estimator has been proposed in the literature that strikes a natural trade-off between the unbiasedness of the TSLS procedure and the relatively small variance of the OLS estimator. Moreover, the SPSL has the advantage that its shrinkage parameter can be directly estimated from the data. In this paper, we demonstrate how this Stein-like estimator can be implemented in the context of the estimation of natural direct and natural indirect effects of treatments in randomized controlled trials. The performance of the competing methods is studied in a simulation study, in which both the strength of hidden confounding and the strength of the instruments are independently varied. These considerations are motivated by a trial in mental health evaluating the impact of a primary care-based intervention to reduce depression in the elderly.
Submitted 6 July, 2017;
originally announced July 2017.
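As a rough illustration of the OLS-based mediation analysis that the abstract above takes as its starting point, the sketch below applies the standard product-of-coefficients decomposition in a linear model without treatment-mediator interaction. The simulated data, the coefficient values, and the helper `ols_coefs` are assumptions made for illustration only; this is not the SPSL estimator studied in the paper, and it presumes no hidden mediator-outcome confounding.

```python
import numpy as np

def ols_coefs(columns, y):
    """Least-squares coefficients (intercept first) of y on the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical randomized trial: t is treatment, m the mediator, y the outcome.
rng = np.random.default_rng(1)
n = 2000
t = rng.integers(0, 2, size=n).astype(float)
m = 0.5 * t + rng.normal(size=n)
y = 0.3 * t + 0.7 * m + rng.normal(size=n)

a = ols_coefs([t], m)[1]             # treatment -> mediator
_, direct, b = ols_coefs([t, m], y)  # direct effect and mediator -> outcome
indirect = a * b                     # product-of-coefficients indirect effect
total = ols_coefs([t], y)[1]         # total effect of treatment on outcome

print(f"direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

With OLS fitted on the same sample, the total effect equals the sum of the direct and indirect estimates, which is a convenient sanity check on the decomposition.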
-
Convex Combination of Ordinary Least Squares and Two-stage Least Squares Estimators
Authors:
Cedric E. Ginestet,
Richard Emsley,
Sabine Landau
Abstract:
In the presence of confounders, the ordinary least squares (OLS) estimator is known to be biased. This problem can be remedied by using the two-stage least squares (TSLS) estimator, based on the availability of valid instrumental variables (IVs). This reduction in bias, however, is offset by an increase in variance. Under standard assumptions, the OLS has indeed a larger bias than the TSLS estimator; and moreover, one can prove that the sample variance of the OLS estimator is no greater than the one of the TSLS. Therefore, it is natural to ask whether one could combine the desirable properties of the OLS and TSLS estimators. Such a trade-off can be achieved through a convex combination of these two estimators, thereby producing our proposed convex least squares (CLS) estimator. The relative contribution of the OLS and TSLS estimators is here chosen to minimize a sample estimate of the mean squared error (MSE) of their convex combination. This proportion parameter is proved to be unique, whenever the OLS and TSLS differ in MSEs. Remarkably, we show that this proportion parameter can be estimated from the data, and that the resulting CLS estimator is consistent. We also show how the CLS framework can incorporate other asymptotically unbiased estimators, such as the jackknife IV estimator (JIVE). The finite-sample properties of the CLS estimator are investigated using Monte Carlo simulations, in which we independently vary the amount of confounding and the strength of the instrument. Overall, the CLS estimator is found to outperform the TSLS estimator in terms of MSE. The method is also applied to a classic data set from econometrics, which models the financial return to education.
Submitted 13 April, 2015;
originally announced April 2015.
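The bias-variance trade-off described in the abstract above can be sketched for a single regressor and a single instrument. In this minimal example the mixing `weight` is supplied by hand; the paper instead chooses it by minimizing a sample estimate of the MSE, which is not reproduced here. The function names and the simulated data-generating process are assumptions for illustration.

```python
import numpy as np

def ols(x, y):
    """Slope of the least-squares regression of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

def tsls(z, x, y):
    """Two-stage least squares slope with a single instrument z."""
    Z = np.column_stack([np.ones_like(z), z])
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]  # first-stage fitted values
    return ols(x_hat, y)                              # second stage

def convex_combination(b_ols, b_tsls, weight):
    """weight * OLS + (1 - weight) * TSLS, with weight in [0, 1]."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * b_ols + (1.0 - weight) * b_tsls

# Hypothetical confounded data: u is unmeasured, z is a valid instrument.
rng = np.random.default_rng(0)
n = 5000
u = rng.normal(size=n)
z = rng.normal(size=n)
x = 0.8 * z + u + rng.normal(size=n)
y = 1.0 * x + u + rng.normal(size=n)  # true slope is 1.0

b_ols = ols(x, y)
b_tsls = tsls(z, x, y)
b_mix = convex_combination(b_ols, b_tsls, weight=0.3)  # hand-picked weight
print(b_ols, b_tsls, b_mix)
```

In this simulation the OLS slope is pulled away from the true value by the confounder, the TSLS slope sits close to it, and any convex combination lies between the two.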
-
Hypothesis Testing For Network Data in Functional Neuroimaging
Authors:
Cedric E. Ginestet,
Jun Li,
Prakash Balachandran,
Steven Rosenberg,
Eric D. Kolaczyk
Abstract:
In recent years, it has become common practice in neuroscience to use networks to summarize relational information in a set of measurements, typically assumed to be reflective of either functional or structural relationships between regions of interest in the brain. One of the most basic tasks of interest in the analysis of such data is the testing of hypotheses, in answer to questions such as "Is there a difference between the networks of these two groups of subjects?" In the classical setting, where the unit of interest is a scalar or a vector, such questions are answered through the use of familiar two-sample testing strategies. Networks, however, are not Euclidean objects, and hence classical methods do not directly apply. We address this challenge by drawing on concepts and techniques from geometry and high-dimensional statistical inference. Our work is based on a precise geometric characterization of the space of graph Laplacian matrices and a nonparametric notion of averaging due to Fréchet. We motivate and illustrate our resulting methodologies for testing in the context of networks derived from functional neuroimaging data on human subjects from the 1000 Functional Connectomes Project. In particular, we show that this global test is more statistically powerful than a mass-univariate approach. In addition, we have also provided a method for visualizing the individual contribution of each edge to the overall test statistic.
Submitted 17 March, 2017; v1 submitted 21 July, 2014;
originally announced July 2014.
-
Percolation under Noise: Detecting Explosive Percolation Using the Second Largest Component
Authors:
Wes Viles,
Cedric E. Ginestet,
Ariana Tang,
Mark A. Kramer,
Eric D. Kolaczyk
Abstract:
We consider the problem of distinguishing classical (Erdős-Rényi) percolation from explosive (Achlioptas) percolation, under noise. A statistical model of percolation is constructed allowing for the birth and death of edges as well as the presence of noise in the observations. This graph-valued stochastic process is composed of a latent and an observed non-stationary process, where the observed graph process is corrupted by Type I and Type II errors. This produces a hidden Markov graph model. We show that for certain choices of parameters controlling the noise, the classical (ER) percolation is visually indistinguishable from the explosive (Achlioptas) percolation model. In this setting, we compare two different criteria for discriminating between these two percolation models, based on a quantile difference (QD) of the first component's size and on the maximal size of the second largest component. We show through data simulations that this second criterion outperforms the QD of the first component's size, in terms of discriminatory power. The maximal size of the second component therefore provides a useful statistic for distinguishing between the ER and Achlioptas models of percolation, under physically motivated conditions for the birth and death of edges, and under noise. The potential application of the proposed criteria for percolation detection in clinical neuroscience is also discussed.
Submitted 15 January, 2014;
originally announced January 2014.
-
Statistical Network Analysis for Functional MRI: Summary Networks and Group Comparisons
Authors:
Cedric E. Ginestet,
Arnaud P. Fournel,
Andrew Simmons
Abstract:
Comparing weighted networks in neuroscience is hard, because the topological properties of a given network are necessarily dependent on the number of edges of that network. This problem arises in the analysis of both weighted and unweighted networks. The term density is often used in this context, in order to refer to the mean edge weight of a weighted network, or to the number of edges in an unweighted one. Comparing families of networks is therefore statistically difficult because differences in topology are necessarily associated with differences in density. In this review paper, we consider this problem from two different perspectives, which include (i) the construction of summary networks, such as how to compute and visualize the mean network from a sample of network-valued data points; and (ii) how to test for topological differences, when two families of networks also exhibit significant differences in density. In the first instance, we show that the issue of summarizing a family of networks can be conducted by adopting a mass-univariate approach, which produces a statistical parametric network (SPN). In the second part of this review, we then highlight the inherent problems associated with the comparison of topological functions of families of networks that differ in density. In particular, we show that a wide range of topological summaries, such as global efficiency and network modularity are highly sensitive to differences in density. Moreover, these problems are not restricted to unweighted metrics, as we demonstrate that the same issues remain present when considering the weighted versions of these metrics. We conclude by encouraging caution, when reporting such statistical comparisons, and by emphasizing the importance of constructing summary networks.
Submitted 27 March, 2014; v1 submitted 12 August, 2013;
originally announced August 2013.
-
Group Analysis of Self-organizing Maps based on Functional MRI using Restricted Frechet Means
Authors:
Arnaud P. Fournel,
Emanuelle Reynaud,
Michael J. Brammer,
Andrew Simmons,
Cedric E. Ginestet
Abstract:
Studies of functional MRI data are increasingly concerned with the estimation of differences in spatio-temporal networks across groups of subjects or experimental conditions. Unsupervised clustering and independent component analysis (ICA) have been used to identify such spatio-temporal networks. While these approaches have been useful for estimating these networks at the subject-level, comparisons over groups or experimental conditions require further methodological development. In this paper, we tackle this problem by showing how self-organizing maps (SOMs) can be compared within a Frechean inferential framework. Here, we summarize the mean SOM in each group as a Frechet mean with respect to a metric on the space of SOMs. We consider the use of different metrics, and introduce two extensions of the classical sum of minimum distance (SMD) between two SOMs, which take into account the spatio-temporal pattern of the fMRI data. The validity of these methods is illustrated on synthetic data. Through these simulations, we show that the three metrics of interest behave as expected, in the sense that the ones capturing temporal, spatial and spatio-temporal aspects of the SOMs are more likely to reach significance under simulated scenarios characterized by temporal, spatial and spatio-temporal differences, respectively. In addition, a re-analysis of a classical experiment on visually-triggered emotions demonstrates the usefulness of this methodology. In this study, the multivariate functional patterns typical of the subjects exposed to pleasant and unpleasant stimuli are found to be more similar than the ones of the subjects exposed to emotionally neutral stimuli. Taken together, these results indicate that our proposed methods can cast new light on existing data by adopting a global analytical perspective on functional MRI paradigms.
Submitted 13 August, 2012; v1 submitted 28 May, 2012;
originally announced May 2012.
-
Strong Consistency of Frechet Sample Mean Sets for Graph-Valued Random Variables
Authors:
Cedric E. Ginestet
Abstract:
The Frechet mean or barycenter generalizes the idea of averaging in spaces where pairwise addition is not well-defined. In general metric spaces, the Frechet sample mean is not a consistent estimator of the theoretical Frechet mean. For graph-valued random variables, for instance, the Frechet sample mean may fail to converge to a unique value. Hence, it becomes necessary to consider the convergence of sequences of sets of graphs. We show that a specific type of almost sure convergence for the Frechet sample mean previously introduced by Ziezold (1977) is, in fact, equivalent to the Kuratowski outer limit of a sequence of Frechet sample means. Equipped with this outer limit, we provide a new proof of the strong consistency of the Frechet sample mean for graph-valued random variables in separable (pseudo-)metric space. Our proof strategy exploits the fact that the metric of interest is bounded, since we are considering graphs over a finite number of vertices. In this setting, we describe two strong laws of large numbers for both the restricted and unrestricted Frechet sample means of all orders, thereby generalizing a previous result, due to Sverdrup-Thygeson (1981).
Submitted 15 May, 2013; v1 submitted 14 April, 2012;
originally announced April 2012.
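The reason the abstract above works with sets of means rather than a single mean can be seen in a small sketch of the restricted Fréchet sample mean, where the minimizer is searched for among the sampled graphs only. The Hamming distance on adjacency matrices and the function names are assumptions chosen for illustration; the paper's results are stated for general bounded (pseudo-)metrics.

```python
import numpy as np

def hamming(a, b):
    """Number of vertex pairs on which two undirected graphs disagree."""
    return int(np.sum(np.triu(a != b, k=1)))

def restricted_frechet_mean_set(graphs, order=2):
    """Indices of the sampled graphs minimizing the sum of distances**order."""
    costs = [sum(hamming(g, h) ** order for h in graphs) for g in graphs]
    best = min(costs)
    return [i for i, c in enumerate(costs) if c == best]

def graph(n, edges):
    """Boolean adjacency matrix of an undirected graph on n vertices."""
    adj = np.zeros((n, n), dtype=bool)
    for i, j in edges:
        adj[i, j] = adj[j, i] = True
    return adj

g0 = graph(3, [])                # empty graph
g1 = graph(3, [(0, 1)])          # one edge
g2 = graph(3, [(0, 1), (1, 2)])  # two edges

unique_case = restricted_frechet_mean_set([g0, g1, g2])
tied_case = restricted_frechet_mean_set([g0, g2])
print(unique_case, tied_case)
```

In the first sample the middle graph is the only minimizer, whereas in the second both sampled graphs tie, so the sample mean is a set with two elements.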
-
Topological Randomness and Number of Edges Predict Modular Structure in Functional Brain Networks
Authors:
Cedric E. Ginestet,
Jonny O'Muircheartaigh,
Owen G. O'Daly,
Andrew Simmons
Abstract:
In a recent paper, Bassett et al. (2011) have analyzed the static and dynamic organization of functional brain networks in humans. We here focus on the first claim made in this paper, which states that the static modular structure of such networks is nested with respect to time. Bassett et al. (2011) argue that this graded structure underlines a "multiscale modular structure". In this letter, however, we show that such a relationship is substantially mediated by an increase in the random variation of the correlation coefficients computed at different time scales.
Submitted 29 June, 2011;
originally announced June 2011.
-
Classification Loss Function for Parameter Ensembles in Bayesian Hierarchical Models
Authors:
Cedric E. Ginestet,
Nicky G. Best,
Sylvia Richardson
Abstract:
Parameter ensembles or sets of point estimates constitute one of the cornerstones of modern statistical practice. This is especially the case in Bayesian hierarchical models, where different decision-theoretic frameworks can be deployed to summarize such parameter ensembles. The estimation of these parameter ensembles may thus substantially vary depending on which inferential goals are prioritised by the modeller. In this note, we consider the problem of classifying the elements of a parameter ensemble above or below a given threshold. Two threshold classification losses (TCLs) --weighted and unweighted-- are formulated. The weighted TCL can be used to emphasize the estimation of false positives over false negatives or the converse. We prove that the weighted and unweighted TCLs are optimized by the ensembles of unit-specific posterior quantiles and posterior medians, respectively. In addition, we relate these classification loss functions on parameter ensembles to the concepts of posterior sensitivity and specificity. Finally, we find some relationships between the unweighted TCL and the absolute value loss, which explain why both functions are minimized by posterior medians.
Submitted 9 June, 2011; v1 submitted 31 May, 2011;
originally announced May 2011.
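The optimality of posterior medians stated in the abstract above can be checked numerically on simulated posterior draws. This sketch assumes that the unweighted loss simply counts false positives and false negatives equally, and it uses made-up skewed draws in place of real MCMC output; the function name and all numerical settings are hypothetical.

```python
import numpy as np

def expected_misclassification(draws, labels, threshold):
    """Posterior expected number of units placed on the wrong side of threshold.

    draws:  array of shape (n_draws, n_units) of posterior samples.
    labels: boolean array, True where a unit is classified as above threshold.
    """
    prob_above = (draws > threshold).mean(axis=0)
    # A unit labelled "above" is wrong with probability 1 - prob_above, and vice versa.
    return float(np.where(labels, 1.0 - prob_above, prob_above).sum())

rng = np.random.default_rng(2)
n_draws, n_units = 1001, 50  # odd number of draws keeps each median a sampled value
centers = rng.normal(size=n_units)
draws = centers + rng.exponential(scale=1.0, size=(n_draws, n_units))  # skewed posteriors
threshold = 1.0

by_median = np.median(draws, axis=0) > threshold
by_mean = draws.mean(axis=0) > threshold
loss_median = expected_misclassification(draws, by_median, threshold)
loss_mean = expected_misclassification(draws, by_mean, threshold)
print(loss_median, loss_mean)
```

Classifying by the posterior median amounts to labelling a unit as above the threshold whenever more than half of its draws exceed it, so its expected loss cannot be larger than that of the mean-based rule.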
-
Bayesian Decision-theoretic Methods for Parameter Ensembles with Application to Epidemiology
Authors:
Cedric E. Ginestet
Abstract:
Parameter ensembles or sets of random effects constitute one of the cornerstones of modern statistical practice. This is especially the case in Bayesian hierarchical models, where several decision theoretic frameworks can be deployed. The estimation of these parameter ensembles may substantially vary depending on which inferential goals are prioritised by the modeller. Since one may wish to satisfy a range of desiderata, it is therefore of interest to investigate whether some sets of point estimates can simultaneously meet several inferential objectives. In this thesis, we will be especially concerned with identifying ensembles of point estimates that produce good approximations of (i) the true empirical quantiles and empirical quartile ratio (QR) and (ii) provide an accurate classification of the ensemble's elements above and below a given threshold. For this purpose, we review various decision-theoretic frameworks, which have been proposed in the literature in relation to the optimisation of different aspects of the empirical distribution of a parameter ensemble. This includes the constrained Bayes (CB), weighted-rank squared error loss (WRSEL), and triple-goal (GR) ensembles of point estimates. In addition, we also consider the set of maximum likelihood estimates (MLEs) and the ensemble of posterior means --the latter being optimal under the summed squared error loss (SSEL). Firstly, we test the performance of these different sets of point estimates as plug-in estimators for the empirical quantiles and empirical QR under a range of synthetic scenarios encompassing both spatial and non-spatial simulated data sets. Performance evaluation is here conducted using the posterior regret. Secondly, two threshold classification losses (TCLs) --weighted and unweighted-- are formulated and formally optimised. The performance of these decision-theoretic tools is also evaluated on real data sets.
Submitted 18 March, 2014; v1 submitted 25 May, 2011;
originally announced May 2011.
-
Brain Network Analysis: Separating Cost from Topology using Cost-integration
Authors:
Cedric E. Ginestet,
Thomas E. Nichols,
Ed T. Bullmore,
Andrew Simmons
Abstract:
A statistically principled way of conducting weighted network analysis is still lacking. Comparison of different populations of weighted networks is hard because topology is inherently dependent on wiring cost, where cost is defined as the number of edges in an unweighted graph. In this paper, we evaluate the benefits and limitations associated with using cost-integrated topological metrics. Our focus is on comparing populations of weighted undirected graphs using global efficiency. We evaluate different approaches to the comparison of weighted networks that differ in mean association weight. Our key result shows that integrating over cost is equivalent to controlling for any monotonic transformation of the weight set of a weighted graph. That is, when integrating over cost, we eliminate the differences in topology that may be due to a monotonic transformation of the weight set. Our result holds for any unweighted topological measure. Cost-integration is therefore helpful in disentangling differences in cost from differences in topology. By contrast, we show that the use of the weighted version of a topological metric does not constitute a valid approach to this problem. Indeed, we prove that, under mild conditions, the use of the weighted version of global efficiency is equivalent to simply comparing weighted costs. Thus, we recommend the reporting of (i) differences in weighted costs and (ii) differences in cost-integrated topological measures. We demonstrate the application of these techniques in a re-analysis of an fMRI working memory task. Finally, we discuss the limitations of integrating topology over cost, which may pose problems when some weights are zero, when multiplicities exist in the ranks of the weights, and when one expects subtle cost-dependent topological differences, which could be masked by cost-integration.
Submitted 9 June, 2011; v1 submitted 19 April, 2011;
originally announced April 2011.
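The invariance result in the abstract above can be illustrated with a small sketch: edges are added in decreasing order of weight, global efficiency is computed at every cost level, and the values are averaged. Averaging over every edge count is one simple way to discretize the integral and is an assumption of this example, as are the function names and the random weight matrix.

```python
from collections import deque
from itertools import combinations
import numpy as np

def global_efficiency(adj):
    """Mean inverse shortest-path length over ordered vertex pairs (unweighted)."""
    n = len(adj)
    total = 0.0
    for source in range(n):
        dist = [-1] * n  # -1 marks vertices not yet reached
        dist[source] = 0
        queue = deque([source])
        while queue:     # breadth-first search from source
            v = queue.popleft()
            for w in range(n):
                if adj[v][w] and dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(1.0 / d for d in dist if d > 0)  # unreachable pairs add 0
    return total / (n * (n - 1))

def cost_integrated_efficiency(weights):
    """Average global efficiency over all cost levels of a weighted graph."""
    n = weights.shape[0]
    edges = sorted(combinations(range(n), 2), key=lambda e: weights[e], reverse=True)
    adj = np.zeros((n, n), dtype=bool)
    values = []
    for i, j in edges:   # add the strongest remaining edge at each step
        adj[i, j] = adj[j, i] = True
        values.append(global_efficiency(adj))
    return float(np.mean(values))

rng = np.random.default_rng(3)
w = rng.random((6, 6))
w = (w + w.T) / 2.0      # symmetric weights in (0, 1), distinct with probability one

original = cost_integrated_efficiency(w)
transformed = cost_integrated_efficiency(w ** 3)  # strictly increasing transform
print(original, transformed)
```

Because the cube preserves the ranking of the weights, both calls visit the same sequence of unweighted graphs and return the same value, which is the sense in which cost-integration controls for monotonic transformations of the weights.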
-
Recursive Shortest Path Algorithm with Application to Density-integration of Weighted Graphs
Authors:
Cedric E. Ginestet,
Andrew Simmons
Abstract:
Graph theory is increasingly commonly utilised in genetics, proteomics and neuroimaging. In such fields, the data of interest generally constitute weighted graphs. Analysis of such weighted graphs often requires the integration of topological metrics with respect to the density of the graph. Here, density refers to the proportion of the number of edges present in that graph. When topological metrics based on shortest paths are of interest, such density-integration usually necessitates the iterative application of Dijkstra's algorithm in order to compute the shortest path matrix at each density level. In this short note, we describe a recursive shortest path algorithm based on single edge updating, which replaces the need for the iterative use of Dijkstra's algorithm. Our proposed procedure is based on pairs of breadth-first searches around each of the vertices incident to the edge added at each recursion. An algorithmic analysis of the proposed technique is provided. When the graph of interest is coded as an adjacency list, our algorithm can be shown to be more efficient than an iterative use of Dijkstra's algorithm.
Submitted 7 April, 2011;
originally announced April 2011.
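The idea of updating the shortest path matrix one edge at a time, rather than recomputing it at each density level, can be sketched as follows. This example substitutes a simple dense update rule (every pair checks whether routing through the new edge is shorter) for the breadth-first-search procedure described in the abstract above, so it illustrates the principle and not the paper's algorithm. The function name is hypothetical.

```python
import numpy as np

def add_edge_update(dist, u, v):
    """Shortest-path matrix after adding the undirected, unweighted edge (u, v).

    dist holds hop counts, with np.inf for disconnected pairs and 0 on the diagonal.
    """
    via_uv = dist[:, [u]] + 1 + dist[[v], :]  # i -> u, cross the new edge, v -> j
    via_vu = dist[:, [v]] + 1 + dist[[u], :]  # i -> v, cross the new edge, u -> j
    return np.minimum(dist, np.minimum(via_uv, via_vu))

n = 5
dist = np.full((n, n), np.inf)
np.fill_diagonal(dist, 0.0)

# Build the path 0-1-2-3-4 one edge at a time.
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    dist = add_edge_update(dist, u, v)
path_end_to_end = dist[0, 4]

# Closing the cycle shortens several distances in a single update.
dist = add_edge_update(dist, 0, 4)
print(path_end_to_end, dist[0, 4], dist[0, 3])
```

A shortest path in the updated graph uses the new edge at most once, which is why comparing the old distance with the two routes through that edge is sufficient.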