
doi 10.1146/annurev-statistics-042720-125902

Prediction-Based Decisions and Fairness: A Catalogue of Choices, Assumptions, and Definitions

Authors: Shira Mitchell, Eric Potash, Solon Barocas, Alexander D'Amour, Kristian Lum

Abstract: A recent flurry of research activity has attempted to quantitatively define "fairness" for decisions based on statistical and machine learning (ML) predictions. The rapid growth of this new field has led to wildly inconsistent terminology and notation, presenting a serious challenge for cataloguing and comparing definitions. This paper attempts to bring much-needed order. First, we explicate the various choices and assumptions made, often implicitly, to justify the use of prediction-based decisions. Next, we show how such choices and assumptions can raise concerns about fairness and we present a notationally consistent catalogue of fairness definitions from the ML literature. In doing so, we offer a concise reference for thinking through the choices, assumptions, and fairness considerations of prediction-based decision systems.

Submitted 24 April, 2020; v1 submitted 19 November, 2018; originally announced November 2018.

Journal ref: Annual Review of Statistics and Its Application 2021 8:1

arXiv:1711.07949 [pdf, other]

doi 10.1016/j.econlet.2018.03.012

Randomization Bias in Field Trials to Evaluate Targeting Methods

Authors: Eric Potash

Abstract: This paper studies the evaluation of methods for targeting the allocation of limited resources to a high-risk subpopulation. We consider a randomized controlled trial to measure the difference in efficiency between two targeting methods and show that it is biased. An alternative, survey-based design is shown to be unbiased. Both designs are simulated for the evaluation of a policy to target lead hazard investigations using a predictive model. Based on our findings, we advised the Chicago Department of Public Health to use the survey design for their field trial. Our work anticipates further developments in economics that will be important as predictive modeling becomes an increasingly common policy tool.

Submitted 15 March, 2018; v1 submitted 21 November, 2017; originally announced November 2017.

Journal ref: Economics Letters 167 (2018) 131-135

arXiv:1709.05551 [pdf, ps, other]

Applying Machine Learning Methods to Enhance the Distribution of Social Services in Mexico

Authors: Kris Sankaran, Diego Garcia-Olano, Mobin Javed, Maria Fernanda Alcala-Durand, Adolfo De Unánue, Paul van der Boor, Eric Potash, Roberto Sánchez Avalos, Luis Iñaki Alberro Encinas, Rayid Ghani

Abstract: The Government of Mexico's social development agency, SEDESOL, is responsible for the administration of social services and has the mission of lifting Mexican families out of poverty. One key challenge they face is matching people who have social service needs with the services SEDESOL can provide accurately and efficiently. In this work we describe two specific applications implemented in collaboration with SEDESOL to enhance their distribution of social services. The first problem relates to systematic underreporting on applications for social services, which makes it difficult to identify where to prioritize outreach. Responding that five people reside in a home when only three do is a type of underreporting that could occur while a social worker conducts a home survey with a family to determine their eligibility for services. The second involves approximating multidimensional poverty profiles across households. That is, can we characterize different types of vulnerabilities -- for example, food insecurity and lack of health services -- faced by those in poverty? We detail the problem context, available data, our machine learning formulation, experimental results, and effective feature sets. As far as we are aware this is the first time government data of this scale has been used to combat poverty within Mexico. We found that survey data alone can suggest potential underreporting. Further, we found geographic features useful for housing and service related indicators and transactional data informative for other dimensions of poverty. The results from our machine learning system for estimating poverty profiles will directly help better match 7.4 million individuals to social programs.

Submitted 16 September, 2017; originally announced September 2017.

Comments: This work was done as part of the 2016 Eric & Wendy Schmidt Data Science for Social Good Summer Fellowship at the University of Chicago

arXiv:1310.4878 [pdf, ps, other]

Euclidean Embeddings and Riemannian Bergman Metrics

Authors: Eric Potash

Abstract: Consider the sum of the first $N$ eigenspaces for the Laplacian on a Riemannian manifold. A basis for this space determines a map to Euclidean space and for $N$ sufficiently large the map is an embedding. In analogy with a fruitful idea of Kähler geometry, we define (Riemannian) Bergman metrics of degree $N$ to be those metrics induced by such embeddings. Our main result is to identify a natural sequence of Bergman metrics approximating any given Riemannian metric. In particular we have constructed finite dimensional symmetric space approximations to the space of all Riemannian metrics. Moreover the construction induces a Riemannian metric on that infinite dimensional manifold which we compute explicitly.

Submitted 28 April, 2014; v1 submitted 17 October, 2013; originally announced October 2013.

Showing 1–4 of 4 results for author: Potash, E