-
When Measurement Mediates the Effect of Interest
Authors:
Joy Zora Nakato,
Janice Litunya,
Brian Beesiga,
Jane Kabami,
James Ayieko,
Moses R. Kamya,
Gabriel Chamie,
Laura B. Balzer
Abstract:
Many health promotion strategies aim to improve reach into the target population and outcomes among those reached. For example, an HIV prevention strategy could expand the reach of risk screening and the delivery of biomedical prevention to persons with HIV risk. This setting creates a complex missing data problem: the strategy improves health outcomes directly and indirectly through expanded reach, while outcomes are only measured among those reached. To formally define the total causal effect in such settings, we use Counterfactual Strata Effects: causal estimands where the outcome is only relevant for a group whose membership is subject to missingness and/or impacted by the exposure. To identify and estimate the corresponding statistical estimand, we propose a novel extension of Two-Stage targeted minimum loss-based estimation (TMLE). Simulations demonstrate the practical performance of our approach as well as the limitations of existing approaches.
Submitted 10 June, 2025; v1 submitted 6 June, 2025;
originally announced June 2025.
-
Causal Inference in Randomized Trials with Partial Clustering
Authors:
Joshua Nugent,
Elijah Kakande,
Gabriel Chamie,
Jane Kabami,
Asiphas Owaraganise,
Diane V. Havlir,
Moses Kamya,
Laura Balzer
Abstract:
Clustering and dependence are common in trials. For example, in some cluster randomized trials (CRTs), pre-existing clusters are enrolled, randomized, and serve as the basis of intervention delivery. Such CRTs are "fully clustered": participants are dependent within clusters. In contrast, "partially clustered" trials contain a mix of participants who are dependent within clusters and participants who are completely independent. One example of this design is a trial where participants are artificially grouped together for the purposes of randomization only; then, for intervention participants, the groups are the basis for intervention delivery, while control participants are un-grouped. Another example is an individually randomized group treatment trial (IRGTT) where participants are individually randomized and, post-randomization, intervention participants are grouped for intervention delivery, while the control participants remain un-grouped. For these two trial designs, we use causal models to non-parametrically describe the data generating process and formalize the observed data dependence structure. We show that despite the different randomization approaches, both designs can be represented with the same dependence structure, enabling the use of the same statistical methods for estimation and inference of causal effects. We propose a novel implementation of targeted minimum loss-based estimation (TMLE) for these trials. TMLE is model-robust, leverages covariate adjustment and machine learning, and estimates many causal effects. In simulations, TMLE achieved comparable or higher statistical power than alternatives for partially clustered designs. Finally, application to real data from the SEARCH-IPT trial resulted in 20-57% efficiency gains, demonstrating the consequences of our proposed approach.
Submitted 8 November, 2024; v1 submitted 6 June, 2024;
originally announced June 2024.
-
Statistical Analysis Plan for the Drinkers' Intervention to Prevent Tuberculosis (DIPT Study)
Authors:
Sara Lodi,
Gabriel Chamie,
Judith Hahn
Abstract:
The Drinkers' Intervention to Prevent TB (DIPT) is a randomized controlled trial designed to determine if incentive-based approaches can reduce alcohol use and improve medication adherence to isoniazid (INH) preventive therapy in persons with HIV (PWH) co-infected with tuberculosis (TB) who engage in heavy drinking in Uganda.
This statistical analysis plan (SAP) provides a detailed description of the primary and secondary outcomes in the study and the corresponding statistical analyses.
Submitted 17 August, 2022;
originally announced August 2022.
-
Statistical Analysis Plan for Health Outcomes in Phase 1 of the SEARCH-IPT Study
Authors:
Laura B. Balzer,
Joshua Nugent,
Diane V. Havlir,
Gabriel Chamie
Abstract:
This document provides the statistical analytic plan (SAP) for evaluating health outcomes in Phase 1 of the SEARCH-IPT Study, a cluster randomized trial to evaluate whether a multicomponent intervention increases uptake of isoniazid (INH) preventive therapy (IPT) and reduces the incidence of tuberculosis (TB) in Uganda (ClinicalTrials.gov: NCT03315962). The SAP was locked prior to unblinding and effect estimation. This SAP was embargoed until November 19, 2021, when it was submitted to arXiv.
Submitted 19 November, 2021;
originally announced November 2021.
-
Two-Stage TMLE to Reduce Bias and Improve Efficiency in Cluster Randomized Trials
Authors:
Laura B. Balzer,
Mark van der Laan,
James Ayieko,
Moses Kamya,
Gabriel Chamie,
Joshua Schwab,
Diane V. Havlir,
Maya L. Petersen
Abstract:
Cluster randomized trials (CRTs) randomly assign an intervention to groups of individuals (e.g., clinics or communities) and measure outcomes on individuals in those groups. While offering many advantages, this experimental design introduces challenges that are only partially addressed by existing analytic approaches. First, outcomes are often missing for some individuals within clusters. Failing to appropriately adjust for differential outcome measurement can result in biased estimates and inference. Second, CRTs often randomize limited numbers of clusters, resulting in chance imbalances on baseline outcome predictors between arms. Failing to adaptively adjust for these imbalances and other predictive covariates can result in efficiency losses. To address these methodological gaps, we propose and evaluate a novel two-stage targeted minimum loss-based estimator (TMLE) to adjust for baseline covariates in a manner that optimizes precision, after controlling for baseline and post-baseline causes of missing outcomes. Finite sample simulations illustrate that our approach can nearly eliminate bias due to differential outcome measurement, while existing CRT estimators yield misleading results and inferences. Application to real data from the SEARCH community randomized trial demonstrates the gains in efficiency afforded through adaptive adjustment for baseline covariates, after controlling for missingness on individual-level outcomes.
Submitted 20 October, 2021; v1 submitted 29 June, 2021;
originally announced June 2021.
-
Semi-Supervised Record Linkage for Construction of Large-Scale Sociocentric Networks in Resource-limited Settings: An application to the SEARCH Study in Rural Uganda and Kenya
Authors:
Yiqun Chen,
Wenjing Zheng,
Lillian B. Brown,
Gabriel Chamie,
Dalsone Kwarisiima,
Jane Kabami,
Tamara D. Clark,
Norton Sang,
James Ayieko,
Edwin D. Charlebois,
Vivek Jain,
Laura Balzer,
Moses R Kamya,
Diane Havlir,
Maya Petersen,
the SEARCH Collaboration
Abstract:
This paper presents a novel semi-supervised algorithmic approach to creating large-scale sociocentric networks in rural East Africa. We describe the construction of 32 large-scale sociocentric social networks in rural Sub-Saharan Africa. Networks were constructed by applying a semi-supervised record-linkage algorithm to data from census-enumerated residents of the 32 communities included in the SEARCH study (NCT01864603), a community-cluster randomized HIV prevention trial in Uganda and Kenya. Contacts were solicited using a five-question name generator in the domains of emotional support, food sharing, free time, health issues, and money issues. The fully constructed networks include 170,028 nodes and 362,965 edges aggregated across communities (ranging from 4,449 to 6,829 nodes and from 2,349 to 31,779 edges per community). Our algorithm matched on average 30% of named contacts in Kenyan communities and 50% of named contacts in Ugandan communities to residents named in census enumeration. Assortative mixing measures for eight different covariates reveal that residents in the network have a very strong tendency to associate with others who are similar to them in age, sex, and especially village. The networks in the SEARCH Study will provide a platform for improved understanding of health outcomes in rural East Africa. The network construction algorithm we present may facilitate future social network research in resource-limited settings.
Submitted 23 August, 2019;
originally announced August 2019.