A Bayesian Classification Trees Approach to Treatment Effect Variation with Noncompliance

Authors: Jared D. Fisher, David W. Puelz, Sameer K. Deshpande

Abstract: Estimating varying treatment effects in randomized trials with noncompliance is inherently challenging since variation comes from two separate sources: variation in the impact itself and variation in the compliance rate. In this setting, existing flexible machine learning methods are highly sensitive to the weak instruments problem, in which the compliance rate is (locally) close to zero. Our main methodological contribution is to present a Bayesian Causal Forest model for binary response variables in scenarios with noncompliance. By repeatedly imputing individuals' compliance types, we can flexibly estimate heterogeneous treatment effects among compliers. Simulation studies demonstrate the usefulness of our approach when compliance and treatment effects are heterogeneous. We apply the method to detect and analyze heterogeneity in the treatment effects in the Illinois Workplace Wellness Study, which not only features heterogeneous and one-sided compliance but also several binary outcomes of interest. We demonstrate the methodology on three outcomes one year after intervention. We confirm a null effect on the presence of a chronic condition, discover meaningful heterogeneity in the impact of the intervention on metabolic parameters, though the average effect is null in classical partial effect estimates, and find substantial heterogeneity in individuals' perception of management prioritization of health and safety.

Submitted 26 August, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

Comments: 24 pages, 8 figures
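
To make the weak instruments problem mentioned in the abstract above concrete, here is a minimal sketch using the classical Wald (instrumental variables) ratio. It is not the paper's Bayesian Causal Forest model; it only shows why a compliance rate near zero is troublesome. The function name `wald_cace` and the toy data are illustrative assumptions, not taken from the paper.

```python
def wald_cace(z, d, y):
    """Textbook Wald estimate of the complier average causal effect.

    z: randomized assignment (0/1)
    d: treatment actually received (0/1)
    y: binary outcome (0/1)
    This is the classical ratio estimator, not the model in the abstract.
    """
    def mean(values):
        return sum(values) / len(values)

    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    d1 = [di for zi, di in zip(z, d) if zi == 1]
    d0 = [di for zi, di in zip(z, d) if zi == 0]

    itt_effect = mean(y1) - mean(y0)   # intention-to-treat effect on the outcome
    compliance = mean(d1) - mean(d0)   # estimated compliance rate
    if compliance == 0:
        raise ZeroDivisionError("compliance rate is zero; the ratio is undefined")
    return itt_effect / compliance, itt_effect, compliance


# Hypothetical toy data: 4 units assigned to treatment, 4 to control.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 0, 0, 0, 0, 0, 0]   # one-sided noncompliance: controls never take treatment
y = [1, 1, 0, 1, 0, 1, 0, 0]

cace, itt, rate = wald_cace(z, d, y)
print(cace, itt, rate)
```

Because the intention-to-treat effect is divided by the estimated compliance rate, any noise in the numerator is amplified as that rate approaches zero, which is the instability the abstract refers to.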

arXiv:1912.10334

A Symmetric Prior for Multinomial Probit Models

Authors: Lane F. Burgette, David Puelz, P. Richard Hahn

Abstract: Fitted probabilities from widely used Bayesian multinomial probit models can depend strongly on the choice of a base category, which is used to uniquely identify the parameters of the model. This paper proposes a novel identification strategy, and associated prior distribution for the model parameters, that renders the prior symmetric with respect to relabeling the outcome categories. The new prior permits an efficient Gibbs algorithm that samples rank-deficient covariance matrices without resorting to Metropolis-Hastings updates.

Submitted 17 May, 2020; v1 submitted 21 December, 2019; originally announced December 2019.

arXiv:1910.10862

A Graph-Theoretic Approach to Randomization Tests of Causal Effects Under General Interference

Authors: David Puelz, Guillaume Basse, Avi Feller, Panos Toulis

Abstract: Interference exists when a unit's outcome depends on another unit's treatment assignment. For example, intensive policing on one street could have a spillover effect on neighboring streets. Classical randomization tests typically break down in this setting because many null hypotheses of interest are no longer sharp under interference. A promising alternative is to instead construct a conditional randomization test on a subset of units and assignments for which a given null hypothesis is sharp. Finding these subsets is challenging, however, and existing methods are limited to special cases or have limited power. In this paper, we propose valid and easy-to-implement randomization tests for a general class of null hypotheses under arbitrary interference between units. Our key idea is to represent the hypothesis of interest as a bipartite graph between units and assignments, and to find an appropriate biclique of this graph. Importantly, the null hypothesis is sharp within this biclique, enabling conditional randomization-based tests. We also connect the size of the biclique to statistical power. Moreover, we can apply off-the-shelf graph clustering methods to find such bicliques efficiently and at scale. We illustrate our approach in settings with clustered interference and show advantages over methods designed specifically for that setting. We then apply our method to a large-scale policing experiment in Medellin, Colombia, where interference has a spatial structure.

Submitted 25 May, 2021; v1 submitted 23 October, 2019; originally announced October 2019.
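
The conditional test described in the abstract above can be sketched as follows, assuming the biclique (a set of focal units and a set of assignments on which the null is sharp) has already been found. This is an illustrative sketch, not the authors' implementation: the function `biclique_randomization_pvalue`, the exposure labels "a" and "b", the absolute difference-in-means statistic, and the toy data are all assumptions, as is the simplification that every assignment in the biclique is equally likely under the design.

```python
from itertools import combinations


def biclique_randomization_pvalue(outcomes, exposure, focal_units, assignments, observed):
    """Conditional randomization p-value restricted to a given biclique.

    outcomes:    dict mapping unit -> observed outcome
    exposure:    function (unit, assignment) -> "a" or "b"
    focal_units: units on one side of the biclique
    assignments: assignments on the other side (must contain `observed`)
    Under the null that exposures "a" and "b" give the same outcome, the
    observed outcomes of focal units stay fixed across these assignments.
    Assumes the assignments are equally likely under the design.
    """
    def statistic(assignment):
        group_a = [outcomes[u] for u in focal_units if exposure(u, assignment) == "a"]
        group_b = [outcomes[u] for u in focal_units if exposure(u, assignment) == "b"]
        if not group_a or not group_b:
            return 0.0
        return abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))

    t_obs = statistic(observed)
    as_extreme = sum(1 for z in assignments if statistic(z) >= t_obs)
    return as_extreme / len(assignments)


# Hypothetical toy setup: focal units 0..3 are never treated themselves;
# unit u is exposed to spillover ("a") when its neighbor u + 4 is treated.
neighbor = {0: 4, 1: 5, 2: 6, 3: 7}

def exposure(unit, assignment):
    return "a" if neighbor[unit] in assignment else "b"

focal_units = [0, 1, 2, 3]
assignments = [frozenset(c) for c in combinations([4, 5, 6, 7], 2)]
observed = frozenset({4, 5})
outcomes = {0: 5.0, 1: 4.0, 2: 1.0, 3: 2.0}

p_value = biclique_randomization_pvalue(outcomes, exposure, focal_units, assignments, observed)
print(p_value)
```

The sketch leaves out the step the paper focuses on, namely finding a suitable biclique in the bipartite graph of units and assignments; here the biclique is simply given.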

arXiv:1706.10180

Regret-based Selection for Sparse Dynamic Portfolios

Authors: David Puelz, P. Richard Hahn, Carlos Carvalho

Abstract: This paper considers portfolio construction in a dynamic setting. We specify a loss function comprised of utility and complexity components with an unknown tradeoff parameter. We develop a novel regret-based criterion for selecting the tradeoff parameter to construct optimal sparse portfolios over time.

Submitted 23 July, 2017; v1 submitted 30 June, 2017; originally announced June 2017.

arXiv:1605.08963

Variable Selection in Seemingly Unrelated Regressions with Random Predictors

Authors: David Puelz, P. Richard Hahn, Carlos Carvalho

Abstract: This paper considers linear model selection when the response is vector-valued and the predictors are randomly observed. We propose a new approach that decouples statistical inference from the selection step in a "post-inference model summarization" strategy. We study the impact of predictor uncertainty on the model selection procedure. The method is demonstrated through an application to asset pricing.

Submitted 3 June, 2016; v1 submitted 29 May, 2016; originally announced May 2016.

arXiv:1602.02176

Regularization and confounding in linear regression for treatment effect estimation

Authors: P. Richard Hahn, Carlos M. Carvalho, Jingyu He, David Puelz

Abstract: This paper investigates the use of regularization priors in the context of treatment effect estimation using observational data where the number of control variables is large relative to the number of observations. First, the phenomenon of regularization-induced confounding is introduced, which refers to the tendency of regularization priors to adversely bias treatment effect estimates by over-shrinking control variable regression coefficients. Then, a simultaneous regression model is presented which permits regularization priors to be specified in a way that avoids this unintentional re-confounding. The new model is illustrated on synthetic and empirical data.

Submitted 27 December, 2016; v1 submitted 5 February, 2016; originally announced February 2016.

arXiv:1510.03385

Optimal ETF Selection for Passive Investing

Authors: David Puelz, Carlos M. Carvalho, P. Richard Hahn

Abstract: This paper considers the problem of isolating a small number of exchange traded funds (ETFs) that suffice to capture the fundamental dimensions of variation in U.S. financial markets. First, the data is fit to a vector-valued Bayesian regression model, which is a matrix-variate generalization of the well known stochastic search variable selection (SSVS) of George and McCulloch (1993). ETF selection is then performed using the decoupled shrinkage and selection (DSS) procedure described in Hahn and Carvalho (2015), adapted in two ways: to the vector-response setting and to incorporate stochastic covariates. The selected set of ETFs is obtained under a number of different penalty and modeling choices. Optimal portfolios are constructed from selected ETFs by maximizing the Sharpe ratio posterior mean, and they are compared to the (unknown) optimal portfolio based on the full Bayesian model. We compare our selection results to popular ETF advisor Wealthfront.com. Additionally, we consider selecting ETFs by modeling a large set of mutual funds.

Submitted 28 November, 2015; v1 submitted 12 October, 2015; originally announced October 2015.

Showing 1–7 of 7 results for author: Puelz, D