-
Learning under random distributional shifts
Authors:
Kirk Bansak,
Elisabeth Paulson,
Dominik Rothenhäusler
Abstract:
Many existing approaches for generating predictions in settings with distribution shift model distribution shifts as adversarial or low-rank in suitable representations. In various real-world settings, however, we might expect shifts to arise through the superposition of many small and random changes in the population and environment. Thus, we consider a class of random distribution shift models that capture arbitrary changes in the underlying covariate space, and dense, random shocks to the relationship between the covariates and the outcomes. In this setting, we characterize the benefits and drawbacks of several alternative prediction strategies: the standard approach that directly predicts the long-term outcome of interest, the proxy approach that directly predicts a shorter-term proxy outcome, and a hybrid approach that utilizes both the long-term policy outcome and (shorter-term) proxy outcome(s). We show that the hybrid approach is robust to the strength of the distribution shift and the proxy relationship. We apply this method to datasets in two high-impact domains: asylum-seeker assignment and early childhood education. In both settings, we find that the proposed approach results in substantially lower mean-squared error than current approaches.
Submitted 30 October, 2023; v1 submitted 5 June, 2023;
originally announced June 2023.
-
Leveraging the Power of Place: A Data-Driven Decision Helper to Improve the Location Decisions of Economic Immigrants
Authors:
Jeremy Ferwerda,
Nicholas Adams-Cohen,
Kirk Bansak,
Jennifer Fei,
Duncan Lawrence,
Jeremy M. Weinstein,
Jens Hainmueller
Abstract:
A growing number of countries have established programs to attract immigrants who can contribute to their economy. Research suggests that an immigrant's initial arrival location plays a key role in shaping their economic success. Yet immigrants currently lack access to personalized information that would help them identify optimal destinations. Instead, they often rely on availability heuristics, which can lead to the selection of sub-optimal landing locations, lower earnings, elevated outmigration rates, and concentration in the most well-known locations. To address this issue and counteract the effects of cognitive biases and limited information, we propose a data-driven decision helper that draws on behavioral insights, administrative data, and machine learning methods to inform immigrants' location decisions. The decision helper provides personalized location recommendations that reflect immigrants' preferences as well as data-driven predictions of the locations where they maximize their expected earnings given their profile. We illustrate the potential impact of our approach using backtests conducted with administrative data that links landing data of recent economic immigrants from Canada's Express Entry system with their earnings retrieved from tax records. Simulations across various scenarios suggest that providing location recommendations to incoming economic immigrants can increase their initial earnings and lead to a mild shift away from the most populous landing destinations. Our approach can be implemented within existing institutional structures at minimal cost, and offers governments an opportunity to harness their administrative data to improve outcomes for economic immigrants.
Submitted 27 July, 2020;
originally announced July 2020.
-
Outcome-Driven Dynamic Refugee Assignment with Allocation Balancing
Authors:
Kirk Bansak,
Elisabeth Paulson
Abstract:
This study proposes two new dynamic assignment algorithms to match refugees and asylum seekers to geographic localities within a host country. The first, currently implemented in a multi-year randomized controlled trial in Switzerland, seeks to maximize the average predicted employment level (or any measured outcome of interest) of refugees through a minimum-discord online assignment algorithm. The performance of this algorithm is tested on real refugee resettlement data from both the US and Switzerland, where we find that it is able to achieve near-optimal expected employment compared to the hindsight-optimal solution, and is able to improve upon the status quo procedure by 40-50%. However, pure outcome maximization can result in a periodically imbalanced allocation to the localities over time, leading to implementation difficulties and an undesirable workflow for resettlement resources and agents. To address these problems, the second algorithm balances the goal of improving refugee outcomes with the desire for an even allocation over time. We find that this algorithm can achieve near-perfect balance over time with only a small loss in expected employment compared to the employment-maximizing algorithm. In addition, the allocation balancing algorithm offers a number of ancillary benefits compared to pure outcome maximization, including robustness to unknown arrival flows and greater exploration.
Submitted 24 May, 2024; v1 submitted 2 July, 2020;
originally announced July 2020.
-
Combining Outcome-Based and Preference-Based Matching: A Constrained Priority Mechanism
Authors:
Avidit Acharya,
Kirk Bansak,
Jens Hainmueller
Abstract:
We introduce a constrained priority mechanism that combines outcome-based matching from machine learning with preference-based allocation schemes common in market design. Using real-world data, we illustrate how our mechanism could be applied to the assignment of refugee families to host country locations, and kindergarteners to schools. Our mechanism allows a planner to first specify a threshold $\bar g$ for the minimum acceptable average outcome score that should be achieved by the assignment. In the refugee matching context, this score corresponds to the predicted probability of employment, while in the student assignment context it corresponds to standardized test scores. The mechanism is a priority mechanism that considers both outcomes and preferences by assigning agents (refugee families, students) based on their preferences, but subject to meeting the planner's specified threshold. The mechanism is both strategy-proof and constrained efficient in that it always generates a matching that is not Pareto dominated by any other matching that respects the planner's threshold.
Submitted 11 August, 2020; v1 submitted 19 February, 2019;
originally announced February 2019.
-
Estimating Causal Moderation Effects with Randomized Treatments and Non-Randomized Moderators
Authors:
Kirk Bansak
Abstract:
Researchers are often interested in analyzing conditional treatment effects. One variant of this is "causal moderation," which implies that intervention upon a third (moderator) variable would alter the treatment effect. This study considers the conditions under which causal moderation can be identified and presents a generalized framework for estimating causal moderation effects given randomized treatments and non-randomized moderators. As part of the estimation process, it allows researchers to implement their preferred method of covariate adjustment, including parametric and non-parametric methods, or alternative identification strategies of their choosing. In addition, it provides a set-up whereby sensitivity analysis designed for the average-treatment-effect context can be extended to the moderation context. To illustrate the methods, the study presents two applications: one dealing with the effect of using the term "welfare" to describe public assistance in the United States, and one dealing with the effect of asylum seekers' religion on European attitudes toward asylum seekers.
Submitted 24 August, 2020; v1 submitted 9 October, 2017;
originally announced October 2017.
-
Comparative Causal Mediation and Relaxing the Assumption of No Mediator-Outcome Confounding: An Application to International Law and Audience Costs
Authors:
Kirk Bansak
Abstract:
Experiments often include multiple treatments, with the primary goal to compare the causal effects of those treatments. This study focuses on comparing the causal anatomies of multiple treatments through the use of causal mediation analysis. It proposes a novel set of comparative causal mediation (CCM) estimands that compare the mediation effects of different treatments via a common mediator. Further, it derives the properties of a set of estimators for the CCM estimands and shows these estimators to be consistent (or conservative) under assumptions that do not require the absence of unobserved confounding of the mediator-outcome relationship, which is a strong and nonrefutable assumption that must typically be made for consistent estimation of individual causal mediation effects. To illustrate the method, the study presents an original application investigating whether and how the international legal status of a foreign policy commitment can increase the domestic political "audience costs" that democratic governments suffer for violating such a commitment. The results provide novel evidence that international legalization can enhance audience costs via multiple causal channels, including by amplifying the perceived immorality of violating the commitment.
Submitted 1 July, 2019; v1 submitted 27 December, 2016;
originally announced December 2016.
-
A Generalized Approach to Power Analysis for Local Average Treatment Effects
Authors:
Kirk Bansak
Abstract:
This study introduces a new approach to power analysis in the context of estimating a local average treatment effect (LATE), where the study subjects exhibit noncompliance with treatment assignment. As a result of distributional complications in the LATE context, compared to the simple ATE context, there is currently no standard method of power analysis for the LATE. Moreover, existing methods and commonly used substitutes - which include instrumental variable (IV), intent-to-treat (ITT), and scaled ATE power analyses - require specifying generally unknown variance terms and/or rely upon strong and unrealistic assumptions, thus providing unreliable guidance on the power of tests of the LATE. This study develops a new approach that uses standardized effect sizes to place bounds on the power for the most commonly used estimator of the LATE, the Wald IV estimator, whereby variance terms and distributional parameters need not be specified nor assumed. Instead, in addition to the effect size, sample size, and error tolerance parameters, the only other parameter that must be specified by the researcher is the compliance rate. Additional conditions can also be introduced to further narrow the bounds on the power calculation. The result is a generalized approach to power analysis in the LATE context that is simple to implement.
Submitted 8 April, 2020; v1 submitted 26 October, 2016;
originally announced October 2016.