Showing 1–33 of 33 results for author: Volfovsky, A

Searching in archive stat.
  1. arXiv:2505.24296  [pdf, ps, other]

    stat.ME cs.LG econ.EM

    Data Fusion for Partial Identification of Causal Effects

    Authors: Quinn Lanners, Cynthia Rudin, Alexander Volfovsky, Harsh Parikh

    Abstract: Data fusion techniques integrate information from heterogeneous data sources to improve learning, generalization, and decision making across data sciences. In causal inference, these methods leverage rich observational data to improve causal effect estimation, while maintaining the trustworthiness of randomized controlled trials. Existing approaches often relax the strong no unobserved confounding…

    Submitted 30 May, 2025; originally announced May 2025.

  2. arXiv:2505.18118  [pdf, ps, other]

    stat.ML cs.LG

    Scalable Policy Maximization Under Network Interference

    Authors: Aidan Gleich, Eric Laber, Alexander Volfovsky

    Abstract: Many interventions, such as vaccines in clinical trials or coupons in online marketplaces, must be assigned sequentially without full knowledge of their effects. Multi-armed bandit algorithms have proven successful in such settings. However, standard independence assumptions fail when the treatment status of one individual impacts the outcomes of others, a phenomenon known as interference. We stud…

    Submitted 23 May, 2025; originally announced May 2025.

  3. arXiv:2501.01505  [pdf, other]

    stat.ME math.ST

    Reinforcement Learning for Respondent-Driven Sampling

    Authors: Justin Weltz, Angela Yoon, Yichi Zhang, Alexander Volfovsky, Eric Laber

    Abstract: Respondent-driven sampling (RDS) is widely used to study hidden or hard-to-reach populations by incentivizing study participants to recruit their social connections. The success and efficiency of RDS can depend critically on the nature of the incentives, including their number, value, call to action, etc. Standard RDS uses an incentive structure that is set a priori and held fixed throughout the s…

    Submitted 2 January, 2025; originally announced January 2025.

  4. arXiv:2312.10569  [pdf, other]

    cs.LG eess.SP stat.ME

    Interpretable Causal Inference for Analyzing Wearable, Sensor, and Distributional Data

    Authors: Srikar Katta, Harsh Parikh, Cynthia Rudin, Alexander Volfovsky

    Abstract: Many modern causal questions ask how treatments affect complex outcomes that are measured using wearable devices and sensors. Current analysis approaches require summarizing these data into scalar statistics (e.g., the mean), but these summaries can be misleading. For example, disparate distributions can have the same means, variances, and other statistics. Researchers can overcome the loss of inf…

    Submitted 20 March, 2024; v1 submitted 16 December, 2023; originally announced December 2023.

  5. arXiv:2310.15333  [pdf, other]

    cs.LG stat.AP stat.ME

    Safe and Interpretable Estimation of Optimal Treatment Regimes

    Authors: Harsh Parikh, Quinn Lanners, Zade Akras, Sahar F. Zafar, M. Brandon Westover, Cynthia Rudin, Alexander Volfovsky

    Abstract: Recent statistical and reinforcement learning methods have significantly advanced patient care strategies. However, these approaches face substantial challenges in high-stakes contexts, including missing data, inherent stochasticity, and the critical requirements for interpretability and patient safety. Our work operationalizes a safe and interpretable framework to identify optimal treatment regim…

    Submitted 1 April, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: Accepted for publication in the proceedings of AISTATS 2025

  6. arXiv:2307.01449  [pdf, other]

    stat.ME cs.AI cs.LG econ.EM

    A Double Machine Learning Approach to Combining Experimental and Observational Data

    Authors: Harsh Parikh, Marco Morucci, Vittorio Orlandi, Sudeepa Roy, Cynthia Rudin, Alexander Volfovsky

    Abstract: Experimental and observational studies often lack validity due to untestable assumptions. We propose a double machine learning approach to combine experimental and observational studies, allowing practitioners to test for assumption violations and estimate treatment effects consistently. Our framework tests for violations of external validity and ignorability under milder assumptions. When only on…

    Submitted 2 April, 2024; v1 submitted 3 July, 2023; originally announced July 2023.

  7. arXiv:2305.07776  [pdf, ps, other]

    stat.AP

    Identifying World Events in Dynamic International Relations Data Using a Latent Space Model

    Authors: Yunran Chen, Alexander Volfovsky

    Abstract: Dynamic network data have become ubiquitous in social network analysis, with new information becoming available that captures when friendships form, when corporate transactions happen, and when countries interact with each other. Flexible and interpretable models are needed in order to properly capture the behavior of individuals in such networks. In this paper, we focus on studying the underlying lat…

    Submitted 12 May, 2023; originally announced May 2023.

  8. arXiv:2304.01316  [pdf, other]

    stat.ME stat.ML

    Matched Machine Learning: A Generalized Framework for Treatment Effect Inference With Learned Metrics

    Authors: Marco Morucci, Cynthia Rudin, Alexander Volfovsky

    Abstract: We introduce Matched Machine Learning, a framework that combines the flexibility of machine learning black boxes with the interpretability of matching, a longstanding tool in observational causal inference. Interpretability is paramount in many high-stakes applications of causal inference. Current tools for nonparametric estimation of both average and individualized treatment effects are black-boxe…

    Submitted 3 April, 2023; originally announced April 2023.

  9. arXiv:2302.11715  [pdf, other]

    stat.ME cs.LG econ.EM

    Variable Importance Matching for Causal Inference

    Authors: Quinn Lanners, Harsh Parikh, Alexander Volfovsky, Cynthia Rudin, David Page

    Abstract: Our goal is to produce methods for observational causal inference that are auditable, easy to troubleshoot, accurate for treatment effect estimation, and scalable to high-dimensional data. We describe a general framework called Model-to-Match that achieves these goals by (i) learning a distance metric via outcome modeling, (ii) creating matched groups using the distance metric, and (iii) using the…

    Submitted 28 June, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

    Journal ref: Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:1174-1184, 2023

  10. arXiv:2212.03683  [pdf, ps, other]

    stat.ML cs.LG econ.EM

    Neighborhood Adaptive Estimators for Causal Inference under Network Interference

    Authors: Alexandre Belloni, Fei Fang, Alexander Volfovsky

    Abstract: Estimating causal effects has become an integral part of most applied fields. In this work we consider the violation of the classical no-interference assumption with units connected by a network. For tractability, we consider a known network that describes how interference may spread. Unlike previous work, the radius (and intensity) of the interference experienced by a unit is unknown and can depen…

    Submitted 3 March, 2025; v1 submitted 7 December, 2022; originally announced December 2022.

  11. arXiv:2206.14570  [pdf, other]

    stat.AP

    Bias and Excess Variance in Election Polling: A Not-So-Hidden Markov Model

    Authors: Graham Tierney, Alexander Volfovsky

    Abstract: With historic misses in the 2016 and 2020 US Presidential elections, interest in measuring polling errors has increased. The most common method for measuring directional errors and non-sampling excess variability during a postmortem for an election is by assessing the difference between the poll result and election result for polls conducted within a few days of the day of the election. Analyzing…

    Submitted 17 February, 2023; v1 submitted 16 June, 2022; originally announced June 2022.

  12. Effects of Epileptiform Activity on Discharge Outcome in Critically Ill Patients

    Authors: Harsh Parikh, Kentaro Hoffman, Haoqi Sun, Wendong Ge, Jin Jing, Rajesh Amerineni, Lin Liu, Jimeng Sun, Sahar Zafar, Aaron Struck, Alexander Volfovsky, Cynthia Rudin, M. Brandon Westover

    Abstract: Epileptiform activity (EA) is associated with worse outcomes including increased risk of disability and death. However, the effect of EA on the neurologic outcome is confounded by the feedback between treatment with anti-seizure medications (ASM) and EA burden. A randomized clinical trial is challenging due to the sequential nature of EA-ASM feedback, as well as ethical reasons. However, some mech…

    Submitted 11 March, 2023; v1 submitted 9 March, 2022; originally announced March 2022.

    Comments: 4 Figures

  13. arXiv:2112.12259  [pdf, other]

    stat.ME

    Density Regression with Bayesian Additive Regression Trees

    Authors: Vittorio Orlandi, Jared Murray, Antonio Linero, Alexander Volfovsky

    Abstract: Flexibly modeling how an entire density changes with covariates is an important but challenging generalization of mean and quantile regression. While existing methods for density regression primarily consist of covariate-dependent discrete mixture models, we consider a continuous latent variable model in general covariate spaces, which we call DR-BART. The prior mapping the latent variable to the…

    Submitted 22 December, 2021; originally announced December 2021.

    Comments: 30 pages, 12 figures

  14. arXiv:2112.07892  [pdf, other]

    stat.ME stat.AP

    Likelihood-based inference for partially observed stochastic epidemics with individual heterogeneity

    Authors: Fan Bu, Allison E. Aiello, Alexander Volfovsky, Jason Xu

    Abstract: We develop a stochastic epidemic model progressing over dynamic networks, where infection rates are heterogeneous and may vary with individual-level covariates. The joint dynamics are modeled as a continuous-time Markov chain such that disease transmission is constrained by the contact network structure, and network evolution is in turn influenced by individual disease statuses. To accommodate par…

    Submitted 15 December, 2021; originally announced December 2021.

  15. arXiv:2112.06097  [pdf, other]

    stat.ME

    Latent Community Adaptive Network Regression

    Authors: Heather Mathews, Alexander Volfovsky

    Abstract: The study of network data in the social and health sciences frequently concentrates on two distinct tasks: (1) detecting community structures among nodes and (2) associating covariate information to edge formation. In much of this data, it is likely that the effects of covariates on edge formation differ between communities (e.g. age might play a different role in friendship formation in communitie…

    Submitted 11 December, 2021; originally announced December 2021.

  16. arXiv:2106.09533  [pdf, other]

    cs.IR cs.LG stat.ME stat.ML

    Author Clustering and Topic Estimation for Short Texts

    Authors: Graham Tierney, Christopher Bail, Alexander Volfovsky

    Abstract: Analysis of short text, such as social media posts, is extremely difficult because of their inherent brevity. In addition to classifying topics of such posts, a common downstream task is grouping the authors of these documents for subsequent analyses. We propose a novel model that expands on the Latent Dirichlet Allocation by modeling strong dependence among the words in the same document, with us…

    Submitted 16 June, 2022; v1 submitted 15 June, 2021; originally announced June 2021.

  17. arXiv:2003.01805  [pdf, other]

    stat.ME cs.LG

    Adaptive Hyper-box Matching for Interpretable Individualized Treatment Effect Estimation

    Authors: Marco Morucci, Vittorio Orlandi, Sudeepa Roy, Cynthia Rudin, Alexander Volfovsky

    Abstract: We propose a matching method for observational data that matches units with others in unit-specific, hyper-box-shaped regions of the covariate space. These regions are large enough that many matches are created for each unit and small enough that the treatment effect is roughly constant throughout. The regions are found as either the solution to a mixed integer program, or using a (fast) approxima…

    Submitted 8 August, 2020; v1 submitted 3 March, 2020; originally announced March 2020.

    Journal ref: Proceedings of the Thirty-sixth Conference on Uncertainty in Artificial Intelligence (UAI 2020)

  18. arXiv:2003.00964  [pdf, other]

    stat.ME cs.LG

    Almost-Matching-Exactly for Treatment Effect Estimation under Network Interference

    Authors: M. Usaid Awan, Marco Morucci, Vittorio Orlandi, Sudeepa Roy, Cynthia Rudin, Alexander Volfovsky

    Abstract: We propose a matching method that recovers direct treatment effects from randomized experiments where units are connected in an observed network, and units that share edges can potentially influence each other's outcomes. Traditional treatment effect estimators for randomized experiments are biased and error prone in this setting. Our method matches units almost exactly on counts of unique subgrap…

    Submitted 2 March, 2020; originally announced March 2020.

    Comments: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)

  19. arXiv:1911.01855  [pdf, other]

    stat.ME

    Gaussian Mixture Models for Stochastic Block Models with Non-Vanishing Noise

    Authors: Heather Mathews, Vaishakhi Mayya, Alexander Volfovsky, Galen Reeves

    Abstract: Community detection tasks have received a lot of attention across statistics, machine learning, and information theory with a large body of work concentrating on theoretical guarantees for the stochastic block model. One line of recent work has focused on modeling the spectral embedding of a network using Gaussian mixture models (GMMs) in scaling regimes where the ability to detect community membe…

    Submitted 5 November, 2019; originally announced November 2019.

  20. arXiv:1910.04221  [pdf, other]

    stat.ME physics.soc-ph q-bio.PE

    Likelihood-based Inference for Partially Observed Epidemics on Dynamic Networks

    Authors: Fan Bu, Allison E. Aiello, Jason Xu, Alexander Volfovsky

    Abstract: We propose a generative model and an inference scheme for epidemic processes on dynamic, adaptive contact networks. Network evolution is formulated as a link-Markovian process, which is then coupled to an individual-level stochastic SIR model, in order to describe the interplay between epidemic dynamics on a network and network link changes. A Markov chain Monte Carlo framework is developed for li…

    Submitted 5 April, 2020; v1 submitted 9 October, 2019; originally announced October 2019.

  21. arXiv:1906.11658  [pdf, other]

    stat.ME cs.LG

    Interpretable Almost-Matching-Exactly With Instrumental Variables

    Authors: M. Usaid Awan, Yameng Liu, Marco Morucci, Sudeepa Roy, Cynthia Rudin, Alexander Volfovsky

    Abstract: Uncertainty in the estimation of the causal effect in observational studies is often due to unmeasured confounding, i.e., the presence of unobserved covariates linking treatments and outcomes. Instrumental Variables (IV) are commonly used to reduce the effects of unmeasured confounding. Existing methods for IV estimation either require strong parametric assumptions, use arbitrary distance metrics,…

    Submitted 28 July, 2019; v1 submitted 27 June, 2019; originally announced June 2019.

    Journal ref: Proceedings of the Thirty-fifth Conference on Uncertainty in Artificial Intelligence (UAI 2019)

  22. arXiv:1811.07415  [pdf, other]

    stat.ME cs.LG econ.EM

    MALTS: Matching After Learning to Stretch

    Authors: Harsh Parikh, Cynthia Rudin, Alexander Volfovsky

    Abstract: We introduce a flexible framework that produces high-quality almost-exact matches for causal inference. Most prior work in matching uses ad-hoc distance metrics, often leading to poor quality matches, particularly when there are irrelevant covariates. In this work, we learn an interpretable distance metric for matching, which leads to substantially higher quality matches. The learned distance metr…

    Submitted 7 June, 2023; v1 submitted 18 November, 2018; originally announced November 2018.

    Comments: 40 pages, 5 Tables, 12 Figures

    Journal ref: Journal of Machine Learning Research 23(240) (2022) 1-42

  23. arXiv:1806.06802  [pdf, other]

    stat.ML cs.LG stat.ME

    Interpretable Almost Matching Exactly for Causal Inference

    Authors: Yameng Liu, Aw Dieng, Sudeepa Roy, Cynthia Rudin, Alexander Volfovsky

    Abstract: We aim to create the highest possible quality of treatment-control matches for categorical data in the potential outcomes framework. Matching methods are heavily used in the social sciences due to their interpretability, but most matching methods do not pass basic sanity checks: they fail when irrelevant variables are introduced, and tend to be either computationally slow or produce low-quality ma…

    Submitted 8 June, 2019; v1 submitted 18 June, 2018; originally announced June 2018.

    Comments: AISTATS 2019

  24. arXiv:1806.06696  [pdf, other]

    stat.AP

    SMOGS: Social Network Metrics of Game Success

    Authors: Fan Bu, Sonia Xu, Katherine Heller, Alexander Volfovsky

    Abstract: This paper develops metrics from a social network perspective that are directly translatable to the outcome of a basketball game. We extend a state-of-the-art multi-resolution stochastic process approach to modeling basketball by modeling passes between teammates as directed dynamic relational links on a network and introduce multiplicative latent factors to study higher-order patterns in players'…

    Submitted 18 June, 2018; originally announced June 2018.

    Journal ref: PMLR 2019 89:2406-2414

  25. arXiv:1801.07310  [pdf, other]

    math.ST stat.ME

    Propensity score methodology in the presence of network entanglement between treatments

    Authors: Panos Toulis, Alexander Volfovsky, Edoardo M. Airoldi

    Abstract: In experimental design and causal inference, it may happen that the treatment is not defined on individual experimental units, but rather on pairs or, more generally, on groups of units. For example, teachers may choose pairs of students who do not know each other to teach a new curriculum; regulators might allow or disallow merging of firms, and biologists may introduce or inhibit interactions be…

    Submitted 22 January, 2018; originally announced January 2018.

  26. arXiv:1707.06315  [pdf, other]

    stat.ML cs.DB

    FLAME: A Fast Large-scale Almost Matching Exactly Approach to Causal Inference

    Authors: Tianyu Wang, Marco Morucci, M. Usaid Awan, Yameng Liu, Sudeepa Roy, Cynthia Rudin, Alexander Volfovsky

    Abstract: A classical problem in causal inference is that of matching, where treatment units need to be matched to control units based on covariate information. In this work, we propose a method that computes high quality almost-exact matches for high-dimensional categorical datasets. This method, called FLAME (Fast Large-scale Almost Matching Exactly), learns a distance metric for matching using a hold-out…

    Submitted 4 February, 2021; v1 submitted 19 July, 2017; originally announced July 2017.

    Journal ref: Journal of Machine Learning Research, 22 (2021) 1-41

  27. arXiv:1601.04083  [pdf, other]

    stat.ME stat.AP

    Observational studies with unknown time of treatment

    Authors: Guillaume W. Basse, Alexander Volfovsky, Edoardo M. Airoldi

    Abstract: Time plays a fundamental role in causal analyses, where the goal is to quantify the effect of a specific treatment on future outcomes. In a randomized experiment, times of treatment, and when outcomes are observed, are typically well defined. In an observational study, treatment time marks the point from which pre-treatment variables must be regarded as outcomes, and it is often straightforward to…

    Submitted 15 January, 2016; originally announced January 2016.

  28. arXiv:1506.07925  [pdf, other]

    stat.CO stat.ML

    Analyzing statistical and computational tradeoffs of estimation procedures

    Authors: Daniel L. Sussman, Alexander Volfovsky, Edoardo M. Airoldi

    Abstract: The recent explosion in the amount and dimensionality of data has exacerbated the need of trading off computational and statistical efficiency carefully, so that inference is both tractable and meaningful. We propose a framework that provides an explicit opportunity for practitioners to specify how much statistical risk they are willing to accept for a given computational cost, and leads to a theo…

    Submitted 25 June, 2015; originally announced June 2015.

  29. arXiv:1501.01234  [pdf, other]

    stat.ME

    Causal inference for ordinal outcomes

    Authors: Alexander Volfovsky, Edoardo M. Airoldi, Donald B. Rubin

    Abstract: Many outcomes of interest in the social and health sciences, as well as in modern applications in computational social science and experimentation on social media platforms, are ordinal and do not have a meaningful scale. Causal analyses that leverage this type of data, termed ordinal non-numeric, require careful treatment, as much of the classical potential outcomes literature is concerned with e…

    Submitted 6 January, 2015; originally announced January 2015.

  30. arXiv:1411.0647  [pdf, other]

    stat.AP

    Multiple Imputation Using Gaussian Copulas

    Authors: Florian M. Hollenbach, Iavor Bojinov, Shahryar Minhas, Nils W. Metternich, Michael D. Ward, Alexander Volfovsky

    Abstract: Missing observations are pervasive throughout empirical research, especially in the social sciences. Despite multiple approaches to dealing adequately with missing data, many scholars still fail to address this vital issue. In this paper, we present a simple-to-use method for generating multiple imputations using a Gaussian copula. The Gaussian copula for multiple imputation (Hoff, 2007) allows sc…

    Submitted 4 October, 2018; v1 submitted 3 November, 2014; originally announced November 2014.

  31. arXiv:1306.5786  [pdf, other]

    math.ST stat.ME

    Testing for nodal dependence in relational data matrices

    Authors: Alexander Volfovsky, Peter D. Hoff

    Abstract: Relational data are often represented as a square matrix, the entries of which record the relationships between pairs of objects. Many statistical methods for the analysis of such data assume some degree of similarity or dependence between objects in terms of the way they relate to each other. However, formal tests for such dependence have not been developed. We provide a test for such dependence…

    Submitted 24 June, 2013; originally announced June 2013.

  32. arXiv:1212.6234  [pdf, other]

    stat.ME

    Likelihoods for fixed rank nomination networks

    Authors: Peter Hoff, Bailey Fosdick, Alex Volfovsky, Katherine Stovel

    Abstract: Many studies that gather social network data use survey methods that lead to censored, missing or otherwise incomplete information. For example, the popular fixed rank nomination (FRN) scheme, often used in studies of schools and businesses, asks study participants to nominate and rank at most a small number of contacts or friends, leaving the existence of other relations uncertain. However, most sta…

    Submitted 26 December, 2012; originally announced December 2012.

    MSC Class: 62N01

  33. Hierarchical array priors for ANOVA decompositions of cross-classified data

    Authors: Alexander Volfovsky, Peter D. Hoff

    Abstract: ANOVA decompositions are a standard method for describing and estimating heterogeneity among the means of a response variable across levels of multiple categorical factors. In such a decomposition, the complete set of main effects and interaction terms can be viewed as a collection of vectors, matrices and arrays that share various index sets defined by the factor levels. For many types of categor…

    Submitted 14 April, 2014; v1 submitted 8 August, 2012; originally announced August 2012.

    Comments: Published at http://dx.doi.org/10.1214/13-AOAS685 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS685

    Journal ref: Annals of Applied Statistics 2014, Vol. 8, No. 1, 19-47