-
Modeling Homophily in Dynamic Networks with Application to HIV Molecular Surveillance
Authors:
V. DeGruttola,
M. Nakazawa,
J. Liu,
X. Tu,
S. Little,
S. Mehta
Abstract:
This paper describes a novel approach to modeling homophily, i.e., the tendency of nodes that share (or differ in) certain attributes to be linked; we consider dynamic networks in which nodes can be added over time but not removed. Our application is to HIV genetic linkage analysis, which has been used to investigate HIV transmission dynamics. In this setting, two HIV sequences from different persons with HIV (PWH) are said to be linked if the genetic distance between these sequences is less than a given threshold. Such linkage suggests that the nodes representing the two PWH are close to each other in a transmission network; such proximity would imply that one of them transmitted the virus to the other, either directly or indirectly through a small number of intermediaries. These viral genetic linkage networks are dynamic in the sense that, over time, a group or cluster of genetically linked viral sequences may increase in size as new people are infected by those in the cluster, either directly or through intermediaries. Our approach makes use of a logistic model to describe homophily with regard to demographic and behavioral characteristics; that is, we investigate whether similarities (or differences) between PWH in these characteristics impact the probability that their sequences are linked. Such analyses provide information about HIV transmission dynamics within a population.
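As a rough illustration of the logistic homophily idea in this abstract, the sketch below regresses a dyad-level linkage indicator on whether the two nodes share an attribute. It is not the authors' model: the network is static, dyads are simulated as independent, and the attribute `group`, the coefficients in `true_beta`, and the helper `fit_logistic` are all hypothetical choices made for the example.

```python
import numpy as np
from itertools import combinations

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

rng = np.random.default_rng(0)
n_nodes = 200
group = rng.integers(0, 2, size=n_nodes)      # hypothetical binary attribute per node

# One row per unordered pair of nodes (dyad).
pairs = np.array(list(combinations(range(n_nodes), 2)))
same = (group[pairs[:, 0]] == group[pairs[:, 1]]).astype(float)

# Assumed data-generating model: log-odds of linkage = -3 + 1 * same_attribute.
true_beta = np.array([-3.0, 1.0])
X = np.column_stack([np.ones(len(same)), same])
p_link = 1.0 / (1.0 + np.exp(-X @ true_beta))
linked = (rng.random(len(same)) < p_link).astype(float)

beta_hat = fit_logistic(X, linked)
# beta_hat[1] is the estimated homophily log-odds ratio for sharing the attribute.
print("estimated homophily coefficient:", beta_hat[1])
```

A positive coefficient on `same` indicates that dyads sharing the attribute are more likely to be linked. Real linkage data would also require handling dependence among dyads that share a node and the growth of the network over time, both of which this sketch ignores.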
Submitted 29 December, 2021;
originally announced January 2022.
-
Bayesian method for inferring the impact of geographical distance on intensity of communication
Authors:
Fei Li,
Jukka-Pekka Onnela,
Victor DeGruttola
Abstract:
Both theoretical models and empirical findings suggest that the intensity of communication among groups of people declines with their degree of geographical separation. There is some evidence that rather than decaying uniformly with distance, the intensity of communication might decline at different rates for shorter and longer distances. Using Bayesian LASSO for model selection, we introduce a statistical model for estimating the rate of communication decline with geographic distance that allows for discontinuities in this rate. We apply our method to an anonymized mobile phone communication dataset. Our results are potentially useful in settings where understanding social and spatial mixing of people is important, such as in the design of cluster randomized trials.
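To make the notion of a decline rate that changes with distance concrete, here is a minimal sketch that fits a piecewise-linear curve with one change point. It swaps in ordinary least squares with a grid search over candidate knots in place of the Bayesian LASSO described in the abstract, and the simulated data, the knot grid, and the function names are all assumptions for illustration.

```python
import numpy as np

def hinge_design(d, knot):
    """Design matrix: intercept, distance, and extra slope beyond the knot."""
    return np.column_stack([np.ones_like(d), d, np.maximum(d - knot, 0.0)])

def fit_one_knot(d, y, candidate_knots):
    """Pick the knot with the smallest residual sum of squares (not Bayesian LASSO)."""
    best = None
    for k in candidate_knots:
        X = hinge_design(d, k)
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ coef) ** 2))
        if best is None or rss < best[0]:
            best = (rss, float(k), coef)
    return best[1], best[2]

rng = np.random.default_rng(1)
d = np.linspace(0.0, 100.0, 400)              # hypothetical distances
true_knot = 30.0                              # assumed change point
# Assumed log-intensity: steep decline up to the knot, shallower decline beyond it.
y = (5.0 - 0.10 * d + 0.08 * np.maximum(d - true_knot, 0.0)
     + rng.normal(0.0, 0.05, d.size))

knot_hat, coef_hat = fit_one_knot(d, y, np.arange(5.0, 96.0, 5.0))
print("selected knot:", knot_hat)
print("slope before knot:", coef_hat[1], "slope change after knot:", coef_hat[2])
```

In this parameterization the slope beyond the knot is `coef_hat[1] + coef_hat[2]`, so a nonzero third coefficient corresponds to a change in the rate of decline. A shrinkage prior over many candidate knots, as in the abstract, would let the data decide how many such changes to keep.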
Submitted 23 May, 2018;
originally announced May 2018.
-
Accounting for interactions and complex inter-subject dependency in estimating treatment effect in cluster randomized trials with missing outcomes
Authors:
Melanie Prague,
Rui Wang,
Alisa Stephens,
Eric Tchetgen Tchetgen,
Victor DeGruttola
Abstract:
Semi-parametric methods are often used for the estimation of intervention effects on correlated outcomes in cluster-randomized trials (CRTs). When outcomes are missing at random (MAR), Inverse Probability Weighted (IPW) methods incorporating baseline covariates can be used to deal with informative missingness. Also, augmented generalized estimating equations (AUG) correct for imbalance in baseline covariates but need to be extended for MAR outcomes. However, in the presence of interactions between treatment and baseline covariates, neither method alone produces consistent estimates for the marginal treatment effect if the model for interaction is not correctly specified. We propose an AUG-IPW estimator that weights by the inverse of the probability of being a complete case and allows different outcome models in each intervention arm. This estimator is doubly robust (DR): it gives correct estimates if either the missing data process or the outcome model is correctly specified. We consider the problem of covariate interference, which arises when the outcome of an individual may depend on covariates of other individuals. When interfering covariates are not modeled, the DR property prevents bias as long as covariate interference is not present simultaneously for the outcome and the missingness. An R package is developed implementing the proposed method. An extensive simulation study and an application to a CRT of an HIV risk-reduction intervention in South Africa illustrate the method.
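The combination of inverse probability weighting with an arm-specific outcome model can be sketched as follows. This is a simplified individual-level augmented IPW estimator, not the authors' AUG-IPW method or their R package: it ignores clustering, generalized estimating equations, and covariate interference, and every simulated quantity and function name here is an assumption made for the example.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, X.T @ (y - p))
    return beta

def aipw_arm_mean(y, r, a, x, arm, p_obs, pi_arm):
    """Augmented IPW estimate of the mean outcome under `arm`, outcomes MAR."""
    design = np.column_stack([np.ones_like(x), x])
    cc = (a == arm) & (r == 1)                    # complete cases in this arm
    coef, *_ = np.linalg.lstsq(design[cc], y[cc], rcond=None)
    m = design @ coef                             # arm-specific outcome prediction
    resid = np.where(cc, y - m, 0.0)              # residual only where Y is observed
    return float(np.mean(resid / (pi_arm * p_obs) + m))

rng = np.random.default_rng(2)
n = 20000
x = rng.normal(size=n)                            # hypothetical baseline covariate
a = rng.integers(0, 2, size=n)                    # 1:1 randomized treatment
# Assumed outcome model with a treatment-by-covariate interaction;
# the true marginal effect is 2 because E[x] = 0.
y_full = 1.0 + 2.0 * a + x + 1.0 * a * x + rng.normal(size=n)
r = (rng.random(n) < expit(0.5 + x)).astype(int)  # observed with probability depending on x
y = np.where(r == 1, y_full, np.nan)

Z = np.column_stack([np.ones(n), x, a])
p_obs = expit(Z @ fit_logistic(Z, r.astype(float)))

effect = (aipw_arm_mean(y, r, a, x, 1, p_obs, 0.5)
          - aipw_arm_mean(y, r, a, x, 0, p_obs, 0.5))
naive = y[(a == 1) & (r == 1)].mean() - y[(a == 0) & (r == 1)].mean()
print("AIPW effect:", effect, "complete-case difference:", naive)
```

Fitting a separate outcome regression in each arm is what lets the sketch accommodate the treatment-by-covariate interaction without specifying it explicitly. The complete-case difference is shown only for contrast, since missingness that depends on `x` biases it in this simulation.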
Submitted 26 January, 2016; v1 submitted 7 July, 2015;
originally announced July 2015.