-
Estimating HIV Cross-sectional Incidence Using Recency Tests from a Non-representative Sample
Authors:
Jianan Pan,
Marlena Bannick,
Fei Gao
Abstract:
Cross-sectional incidence estimation based on recency testing has become a widely used tool in HIV research. Recently, this method has gained prominence in HIV prevention trials to estimate the "placebo" incidence that participants might experience without preventive treatment. The application of this approach faces challenges due to non-representative sampling, as individuals aware of their HIV-positive status may be less likely to participate in screening for an HIV prevention trial. To address this, a recent phase 3 trial excluded individuals based on whether they have had a recent HIV test. To the best of our knowledge, the validity of this approach has yet to be studied. In our work, we investigate the performance of cross-sectional HIV incidence estimation when excluding individuals based on prior HIV tests in realistic trial settings. We develop a statistical framework that incorporates a testing-based criterion and possible non-representative sampling. We introduce a metric we call the effective mean duration of recent infection (MDRI) that mathematically quantifies bias in incidence estimation. We conduct an extensive simulation study to evaluate incidence estimator performance under various scenarios. Our findings reveal that when screening attendance is affected by knowledge of HIV status, incidence estimators become unreliable unless all individuals with recent HIV tests are excluded. Additionally, we identified a trade-off between bias and variability: excluding more individuals reduces bias from non-representative sampling but in many cases increases the variability of incidence estimates. These findings highlight the need for caution when applying testing-based criteria and emphasize the importance of refining incidence estimation methods to improve the design and evaluation of future HIV prevention trials.
Submitted 16 December, 2024;
originally announced December 2024.
-
Robust Nonparametric Stochastic Frontier Analysis
Authors:
Peng Zheng,
Nahom Worku,
Marlena Bannick,
Joseph Dielemann,
Marcia Weaver,
Christopher Murray,
Aleksandr Aravkin
Abstract:
Benchmarking tools, including stochastic frontier analysis (SFA), data envelopment analysis (DEA), and its stochastic extension (StoNED), are core tools in economics used to estimate an efficiency envelope and production inefficiencies from data. The problem appears in a wide range of fields -- for example, in global health the frontier can quantify efficiency of interventions and funding of health initiatives. Despite their wide use, classic benchmarking approaches have key limitations that preclude even wider applicability. Here we propose a robust non-parametric stochastic frontier meta-analysis (SFMA) approach that fills these gaps. First, we use flexible basis splines and shape constraints to model the frontier function, so specifying a functional form of the frontier as in classic SFA is no longer necessary. Second, the user can specify relative errors on input datapoints, enabling population-level analyses. Third, we develop a likelihood-based trimming strategy to robustify the approach to outliers, which otherwise break available benchmarking methods. We provide a custom optimization algorithm for fast and reliable performance. We implement the approach and algorithm in an open source Python package 'sfma'. Synthetic and real examples show the new capabilities of the method, and are used to compare SFMA to state-of-the-art benchmarking packages that implement DEA, SFA, and StoNED.
Submitted 4 April, 2024;
originally announced April 2024.
-
An Enhanced Cross-Sectional HIV Incidence Estimator that Incorporates Prior HIV Test Results
Authors:
Marlena Bannick,
Deborah Donnell,
Richard Hayes,
Oliver Laeyendecker,
Fei Gao
Abstract:
Incidence estimation of HIV infection can be performed using recent infection testing algorithm (RITA) results from a cross-sectional sample. This allows practitioners to understand population trends in the HIV epidemic without having to perform longitudinal follow-up on a cohort of individuals. The utility of the approach is limited by its precision, driven by the (low) sensitivity of the RITA at identifying recent infection. By utilizing results of previous HIV tests that individuals may have taken, we consider an enhanced RITA with increased sensitivity (and specificity). We use it to propose an enhanced estimator for incidence estimation. We prove the theoretical properties of the enhanced estimator and illustrate its numerical performance in simulation studies. We apply the estimator to data from a cluster-randomized trial to study the effect of community-level HIV interventions on HIV incidence. We demonstrate that the enhanced estimator provides a more precise estimate of HIV incidence compared to the standard estimator.
Submitted 11 September, 2023;
originally announced September 2023.
-
Improved convergence rates of nonparametric penalized regression under misspecified total variation
Authors:
Marlena S. Bannick,
Noah Simon
Abstract:
Penalties that induce smoothness are common in nonparametric regression. In many settings, the amount of smoothness in the data generating function will not be known. Simon and Shojaie (2021) derived convergence rates for nonparametric estimators under misspecified smoothness. We show that their theoretical convergence rates can be improved by working with convenient approximating functions. Properties of convolutions and higher-order kernels allow these approximation functions to match the true functions more closely than those used in Simon and Shojaie (2021). As a result, we obtain tighter convergence rates.
Submitted 2 August, 2023;
originally announced August 2023.
-
A General Form of Covariate Adjustment in Randomized Clinical Trials
Authors:
Marlena S. Bannick,
Jun Shao,
Jingyi Liu,
Yu Du,
Yanyao Yi,
Ting Ye
Abstract:
In randomized clinical trials, adjusting for baseline covariates can improve credibility and efficiency for demonstrating and quantifying treatment effects. This article studies the augmented inverse propensity weighted (AIPW) estimator, which is a general form of covariate adjustment that uses linear, generalized linear, and non-parametric or machine learning models for the conditional mean of the response given covariates. Under covariate-adaptive randomization, we establish general theorems that show a complete picture of the asymptotic normality, efficiency gain, and applicability of AIPW estimators. In particular, we provide for the first time a rigorous theoretical justification of using machine learning methods with cross-fitting for dependent data under covariate-adaptive randomization. Based on the general theorems, we offer insights on the conditions for guaranteed efficiency gain and universal applicability under different randomization schemes, which also motivate a joint calibration strategy using some constructed covariates after applying AIPW. Our methods are implemented in the R package RobinCar.
Submitted 25 March, 2024; v1 submitted 16 June, 2023;
originally announced June 2023.
-
Robust Variance Estimation for Covariate-Adjusted Unconditional Treatment Effect in Randomized Clinical Trials with Binary Outcomes
Authors:
Ting Ye,
Marlena Bannick,
Yanyao Yi,
Jun Shao
Abstract:
To improve the precision of estimation and the power of hypothesis testing for an unconditional treatment effect in randomized clinical trials with binary outcomes, researchers and regulatory agencies recommend using g-computation as a reliable method of covariate adjustment. However, the practical application of g-computation is hindered by the lack of an explicit robust variance formula that can be used for different unconditional treatment effects of interest. To fill this gap, we provide explicit and robust variance estimators for g-computation estimators and demonstrate through simulations that the variance estimators can be reliably applied in practice.
Submitted 27 March, 2023; v1 submitted 20 February, 2023;
originally announced February 2023.
-
Accounting for Inconsistent Use of Covariate Adjustment in Group Sequential Trials
Authors:
Marlena S. Bannick,
Sonya L. Heltshe,
Noah Simon
Abstract:
Group sequential designs in clinical trials allow for interim efficacy and futility monitoring. Adjustment for baseline covariates can increase power and precision of estimated effects. However, inconsistently applying covariate adjustment throughout the stages of a group sequential trial can result in inflation of type I error, biased point estimates, and anti-conservative confidence intervals. We propose methods for performing correct interim monitoring, estimation, and inference in this setting that avoid these issues. We focus on two-arm trials with simple, balanced randomization and continuous outcomes. We study the performance of our boundary, estimation, and inference adjustments in simulation studies. We end with recommendations about the application of covariate adjustment in group sequential designs.
Submitted 9 August, 2023; v1 submitted 24 June, 2022;
originally announced June 2022.
-
Retrospective, Observational Studies for Estimating Vaccine Effects on the Secondary Attack Rate of SARS-CoV-2
Authors:
Marlena S. Bannick,
Fei Gao,
Elizabeth R. Brown,
Holly E. Janes
Abstract:
COVID-19 vaccines are highly efficacious at preventing symptomatic infection, severe disease, and death. Most of the evidence that COVID-19 vaccines also reduce transmission of SARS-CoV-2 is based on retrospective, observational studies. Specifically, an increasing number of studies are evaluating vaccine efficacy against the secondary attack rate of SARS-CoV-2 using data available in existing healthcare databases or contact tracing databases. Since these types of databases were designed for clinical diagnosis or management of COVID-19, they are limited in their ability to provide accurate information on infection, infection timing, and transmission events. In this manuscript, we highlight challenges with using existing databases to identify transmission units and confirm potential SARS-CoV-2 transmission events. We discuss the impact of common diagnostic testing strategies including event-prompted and infrequent testing and illustrate their potential biases in estimating vaccine efficacy against the secondary attack rate of SARS-CoV-2. We articulate the need for prospective observational studies of vaccine efficacy against the SARS-CoV-2 SAR, and we provide design and reporting considerations for studies using retrospective databases.
Submitted 12 May, 2022;
originally announced June 2022.
-
Analysis and Methods to Mitigate Effects of Under-reporting in Count Data
Authors:
Jennifer Brennan,
Marlena Bannick,
Nicholas Kassebaum,
Lauren Wilner,
Azalea Thomson,
Aleksandr Aravkin,
Peng Zheng
Abstract:
Under-reporting of count data poses a major roadblock for prediction and inference. In this paper, we focus on the Pogit model, which deconvolves the generating Poisson process from the censoring process controlling under-reporting using a generalized linear modeling framework. We highlight the limitations of the Pogit model and address them by adding constraints to the estimation framework. We also develop uncertainty quantification techniques that are robust to model mis-specification. Our approach is evaluated using synthetic data and applied to real healthcare datasets, where we treat in-patient data as 'reported' counts and use held-out total injuries to validate the results. The methods make it possible to separate the Poisson process from the under-reporting process, given sufficient expert information. Code to implement the approach is available via an open source Python package.
Submitted 27 September, 2021; v1 submitted 24 September, 2021;
originally announced September 2021.
-
Statistical Considerations for Cross-Sectional HIV Incidence Estimation Based on Recency Test
Authors:
Fei Gao,
Marlena S. Bannick
Abstract:
Longitudinal cohorts to determine the incidence of HIV infection are logistically challenging, so researchers have sought alternative strategies. Recency test methods use biomarker profiles of HIV-infected subjects in a cross-sectional sample to infer whether they are "recently" infected and to estimate incidence in the population. Two main estimators have been used in practice: one that assumes a recency test is perfectly specific, and another that allows for false-recent results. To date, these commonly used estimators have not been rigorously studied with respect to their assumptions and statistical properties. In this paper, we present a theoretical framework with which to understand these estimators and interrogate their assumptions, and perform a simulation study to assess the performance of these estimators under realistic HIV epidemiological dynamics. We conclude with recommendations for the use of these estimators in practice and a discussion of future methodological developments to improve HIV incidence estimation via recency test.
Submitted 3 June, 2021;
originally announced June 2021.