Search | arXiv e-print repository

Beyond Incompatibility: Trade-offs between Mutually Exclusive Fairness Criteria in Machine Learning and Law

Authors: Meike Zehlike, Alex Loosley, Håkan Jonsson, Emil Wiedemann, Philipp Hacker

Abstract: Fair and trustworthy AI is becoming ever more important in both machine learning and legal domains. One important consequence is that decision makers must seek to guarantee a 'fair', i.e., non-discriminatory, algorithmic decision procedure. However, there are several competing notions of algorithmic fairness that have been shown to be mutually incompatible under realistic factual assumptions. This… ▽ More Fair and trustworthy AI is becoming ever more important in both machine learning and legal domains. One important consequence is that decision makers must seek to guarantee a 'fair', i.e., non-discriminatory, algorithmic decision procedure. However, there are several competing notions of algorithmic fairness that have been shown to be mutually incompatible under realistic factual assumptions. This concerns, for example, the widely used fairness measures of 'calibration within groups' and 'balance for the positive/negative class'. In this paper, we present a novel algorithm (FAir Interpolation Method: FAIM) for continuously interpolating between these three fairness criteria. Thus, an initially unfair prediction can be remedied to, at least partially, meet a desired, weighted combination of the respective fairness conditions. We demonstrate the effectiveness of our algorithm when applied to synthetic data, the COMPAS data set, and a new, real-world data set from the e-commerce sector. Finally, we discuss to what extent FAIM can be harnessed to comply with conflicting legal obligations. The analysis suggests that it may operationalize duties in traditional legal fields, such as credit scoring and criminal justice proceedings, but also for the latest AI regulations put forth in the EU, like the Digital Markets Act and the recently enacted AI Act. △ Less

Submitted 20 December, 2024; v1 submitted 1 December, 2022; originally announced December 2022.

ACM Class: I.2.m; K.4.2; K.5.m

arXiv:2201.12662 [pdf, other]

Fair ranking: a critical review, challenges, and future directions

Authors: Gourab K Patro, Lorenzo Porcaro, Laura Mitchell, Qiuyue Zhang, Meike Zehlike, Nikhil Garg

Abstract: Ranking, recommendation, and retrieval systems are widely used in online platforms and other societal systems, including e-commerce, media-streaming, admissions, gig platforms, and hiring. In the recent past, a large "fair ranking" research literature has been developed around making these systems fair to the individuals, providers, or content that are being ranked. Most of this literature defines… ▽ More Ranking, recommendation, and retrieval systems are widely used in online platforms and other societal systems, including e-commerce, media-streaming, admissions, gig platforms, and hiring. In the recent past, a large "fair ranking" research literature has been developed around making these systems fair to the individuals, providers, or content that are being ranked. Most of this literature defines fairness for a single instance of retrieval, or as a simple additive notion for multiple instances of retrievals over time. This work provides a critical overview of this literature, detailing the often context-specific concerns that such an approach misses: the gap between high ranking placements and true provider utility, spillovers and compounding effects over time, induced strategic incentives, and the effect of statistical uncertainty. We then provide a path forward for a more holistic and impact-oriented fair ranking research agenda, including methodological lessons from other fields and the role of the broader stakeholder community in overcoming data bottlenecks and designing effective regulatory environments. △ Less

Submitted 29 January, 2022; originally announced January 2022.

arXiv:2103.14000 [pdf, other]

doi 10.1145/3533379

doi 10.1145/3533380

Fairness in Ranking: A Survey

Authors: Meike Zehlike, Ke Yang, Julia Stoyanovich

Abstract: In the past few years, there has been much work on incorporating fairness requirements into algorithmic rankers, with contributions coming from the data management, algorithms, information retrieval, and recommender systems communities. In this survey we give a systematic overview of this work, offering a broad perspective that connects formalizations and algorithmic approaches across subfields. A… ▽ More In the past few years, there has been much work on incorporating fairness requirements into algorithmic rankers, with contributions coming from the data management, algorithms, information retrieval, and recommender systems communities. In this survey we give a systematic overview of this work, offering a broad perspective that connects formalizations and algorithmic approaches across subfields. An important contribution of our work is in developing a common narrative around the value frameworks that motivate specific fairness-enhancing interventions in ranking. This allows us to unify the presentation of mitigation objectives and of algorithmic techniques to help meet those objectives or identify trade-offs. In this survey, we describe four classification frameworks for fairness-enhancing interventions, along which we relate the technical methods surveyed in this paper, discuss evaluation datasets, and present technical work on fairness in score-based ranking. Then, we present methods that incorporate fairness in supervised learning, and also give representative examples of recent work on fairness in recommendation and matchmaking systems. We also discuss evaluation frameworks for fair score-based ranking and fair learning-to-rank, and draw a set of recommendations for the evaluation of fair ranking methods. △ Less

Submitted 12 August, 2022; v1 submitted 25 March, 2021; originally announced March 2021.

Comments: 72 pages. ACM CSUR (2022)

ACM Class: I.2.6; I.2.8; H.3.3

arXiv:2012.12795 [pdf, other]

A Note on the Significance Adjustment for FA*IR with Two Protected Groups

Authors: Meike Zehlike, Tom Sühr, Carlos Castillo

Abstract: In this report we provide an improvement of the significance adjustment from the FA*IR algorithm of Zehlike et al., which did not work for very short rankings in combination with a low minimum proportion $p$ for the protected group. We show how the minimum number of protected candidates per ranking position can be calculated exactly and provide a mapping from the continuous space of significance l… ▽ More In this report we provide an improvement of the significance adjustment from the FA*IR algorithm of Zehlike et al., which did not work for very short rankings in combination with a low minimum proportion $p$ for the protected group. We show how the minimum number of protected candidates per ranking position can be calculated exactly and provide a mapping from the continuous space of significance levels ($α$) to a discrete space of tables, which allows us to find $α_c$ using a binary search heuristic. △ Less

Submitted 23 December, 2020; originally announced December 2020.

arXiv:2010.07848 [pdf]

Towards a Flexible Framework for Algorithmic Fairness

Authors: Philip Hacker, Emil Wiedemann, Meike Zehlike

Abstract: Increasingly, scholars seek to integrate legal and technological insights to combat bias in AI systems. In recent years, many different definitions for ensuring non-discrimination in algorithmic decision systems have been put forward. In this paper, we first briefly describe the EU law framework covering cases of algorithmic discrimination. Second, we present an algorithm that harnesses optimal tr… ▽ More Increasingly, scholars seek to integrate legal and technological insights to combat bias in AI systems. In recent years, many different definitions for ensuring non-discrimination in algorithmic decision systems have been put forward. In this paper, we first briefly describe the EU law framework covering cases of algorithmic discrimination. Second, we present an algorithm that harnesses optimal transport to provide a flexible framework to interpolate between different fairness definitions. Third, we show that important normative and legal challenges remain for the implementation of algorithmic fairness interventions in real-world scenarios. Overall, the paper seeks to contribute to the quest for flexible technical frameworks that can be adapted to varying legal and normative fairness constraints. △ Less

Submitted 15 October, 2020; originally announced October 2020.

Comments: forthcoming in INFORMATIK2020

arXiv:1905.13134 [pdf, other]

doi 10.1145/3366424.3383534

FairSearch: A Tool For Fairness in Ranked Search Results

Authors: Meike Zehlike, Tom Sühr, Carlos Castillo, Ivan Kitanovski

Abstract: Ranked search results and recommendations have become the main mechanism by which we find content, products, places, and people online. With hiring, selecting, purchasing, and dating being increasingly mediated by algorithms, rankings may determine career and business opportunities, educational placement, access to benefits, and even social and reproductive success. It is therefore of societal and… ▽ More Ranked search results and recommendations have become the main mechanism by which we find content, products, places, and people online. With hiring, selecting, purchasing, and dating being increasingly mediated by algorithms, rankings may determine career and business opportunities, educational placement, access to benefits, and even social and reproductive success. It is therefore of societal and ethical importance to ask whether search results can demote, marginalize, or exclude individuals of unprivileged groups or promote products with undesired features. In this paper we present FairSearch, the first fair open source search API to provide fairness notions in ranked search results. We implement two algorithms from the fair ranking literature, namely FA*IR (Zehlike et al., 2017) and DELTR (Zehlike and Castillo, 2018) and provide them as stand-alone libraries in Python and Java. Additionally we implement interfaces to Elasticsearch for both algorithms, that use the aforementioned Java libraries and are then provided as Elasticsearch plugins. Elasticsearch is a well-known search engine API based on Apache Lucene. With our plugins we enable search engine developers who wish to ensure fair search results of different styles to easily integrate DELTR and FA*IR into their existing Elasticsearch environment. △ Less

Submitted 23 April, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

Comments: 4 pages, demo paper

ACM Class: H.3.3

Journal ref: Companion Proceedings of the Web Conference 2020 (WWW '20 Companion), April 20--24, 2020, Taipei, Taiwan

arXiv:1805.08716 [pdf, other]

doi 10.1145/3366424.3383534

Reducing Disparate Exposure in Ranking: A Learning To Rank Approach

Authors: Meike Zehlike, Carlos Castillo

Abstract: Ranked search results have become the main mechanism by which we find content, products, places, and people online. Thus their ordering contributes not only to the satisfaction of the searcher, but also to career and business opportunities, educational placement, and even social success of those being ranked. Researchers have become increasingly concerned with systematic biases in data-driven rank… ▽ More Ranked search results have become the main mechanism by which we find content, products, places, and people online. Thus their ordering contributes not only to the satisfaction of the searcher, but also to career and business opportunities, educational placement, and even social success of those being ranked. Researchers have become increasingly concerned with systematic biases in data-driven ranking models, and various post-processing methods have been proposed to mitigate discrimination and inequality of opportunity. This approach, however, has the disadvantage that it still allows an unfair ranking model to be trained. In this paper we explore a new in-processing approach: DELTR, a learning-to-rank framework that addresses potential issues of discrimination and unequal opportunity in rankings at training time. We measure these problems in terms of discrepancies in the average group exposure and design a ranker that optimizes search results in terms of relevance and in terms of reducing such discrepancies. We perform an extensive experimental study showing that being "colorblind" can be among the best or the worst choices from the perspective of relevance and exposure, depending on how much and which kind of bias is present in the training set. We show that our in-processing method performs better in terms of relevance and exposure than a pre-processing and a post-processing method across all tested scenarios. △ Less

Submitted 27 May, 2020; v1 submitted 22 May, 2018; originally announced May 2018.

Comments: 7 pages

ACM Class: H.3.3

arXiv:1712.07924 [pdf, other]

Matching Code and Law: Achieving Algorithmic Fairness with Optimal Transport

Authors: Meike Zehlike, Philipp Hacker, Emil Wiedemann

Abstract: Increasingly, discrimination by algorithms is perceived as a societal and legal problem. As a response, a number of criteria for implementing algorithmic fairness in machine learning have been developed in the literature. This paper proposes the Continuous Fairness Algorithm (CFA$θ$) which enables a continuous interpolation between different fairness definitions. More specifically, we make three m… ▽ More Increasingly, discrimination by algorithms is perceived as a societal and legal problem. As a response, a number of criteria for implementing algorithmic fairness in machine learning have been developed in the literature. This paper proposes the Continuous Fairness Algorithm (CFA$θ$) which enables a continuous interpolation between different fairness definitions. More specifically, we make three main contributions to the existing literature. First, our approach allows the decision maker to continuously vary between specific concepts of individual and group fairness. As a consequence, the algorithm enables the decision maker to adopt intermediate ``worldviews'' on the degree of discrimination encoded in algorithmic processes, adding nuance to the extreme cases of ``we're all equal'' (WAE) and ``what you see is what you get'' (WYSIWYG) proposed so far in the literature. Second, we use optimal transport theory, and specifically the concept of the barycenter, to maximize decision maker utility under the chosen fairness constraints. Third, the algorithm is able to handle cases of intersectionality, i.e., of multi-dimensional discrimination of certain groups on grounds of several criteria. We discuss three main examples (credit applications; college admissions; insurance contracts) and map out the legal and policy implications of our approach. The explicit formalization of the trade-off between individual and group fairness allows this post-processing approach to be tailored to different situational contexts in which one or the other fairness criterion may take precedence. Finally, we evaluate our model experimentally. △ Less

Submitted 24 September, 2019; v1 submitted 21 December, 2017; originally announced December 2017.

Comments: Vastly extended new version, now including computational experiments

arXiv:1706.06368 [pdf, other]

doi 10.1145/3132847.3132938

FA*IR: A Fair Top-k Ranking Algorithm

Authors: Meike Zehlike, Francesco Bonchi, Carlos Castillo, Sara Hajian, Mohamed Megahed, Ricardo Baeza-Yates

Abstract: In this work, we define and solve the Fair Top-k Ranking problem, in which we want to determine a subset of k candidates from a large pool of n >> k candidates, maximizing utility (i.e., select the "best" candidates) subject to group fairness criteria. Our ranked group fairness definition extends group fairness using the standard notion of protected groups and is based on ensuring that the proport… ▽ More In this work, we define and solve the Fair Top-k Ranking problem, in which we want to determine a subset of k candidates from a large pool of n >> k candidates, maximizing utility (i.e., select the "best" candidates) subject to group fairness criteria. Our ranked group fairness definition extends group fairness using the standard notion of protected groups and is based on ensuring that the proportion of protected candidates in every prefix of the top-k ranking remains statistically above or indistinguishable from a given minimum. Utility is operationalized in two ways: (i) every candidate included in the top-$k$ should be more qualified than every candidate not included; and (ii) for every pair of candidates in the top-k, the more qualified candidate should be ranked above. An efficient algorithm is presented for producing the Fair Top-k Ranking, and tested experimentally on existing datasets as well as new datasets released with this paper, showing that our approach yields small distortions with respect to rankings that maximize utility without considering fairness criteria. To the best of our knowledge, this is the first algorithm grounded in statistical tests that can mitigate biases in the representation of an under-represented group along a ranked list. △ Less

Submitted 2 July, 2018; v1 submitted 20 June, 2017; originally announced June 2017.

Comments: In Proceedings of the 26th ACM International Conference on Information and Knowledge Management (CIKM'17). This version corrects an error on Table 4

ACM Class: H.3.3; J.1

Showing 1–9 of 9 results for author: Zehlike, M