
arXiv:2007.06029

Ensuring Fairness Beyond the Training Data

Authors: Debmalya Mandal, Samuel Deng, Suman Jana, Jeannette M. Wing, Daniel Hsu

Abstract: We initiate the study of fair classifiers that are robust to perturbations in the training distribution. Despite recent progress, the literature on fairness has largely ignored the design of fair and robust classifiers. In this work, we develop classifiers that are fair not only with respect to the training distribution, but also for a class of distributions that are weighted perturbations of the training samples. We formulate a min-max objective function whose goal is to minimize a distributionally robust training loss, and at the same time, find a classifier that is fair with respect to a class of distributions. We first reduce this problem to finding a fair classifier that is robust with respect to the class of distributions. Based on online learning algorithm, we develop an iterative algorithm that provably converges to such a fair and robust solution. Experiments on standard machine learning fairness datasets suggest that, compared to the state-of-the-art fair classifiers, our classifier retains fairness guarantees and test accuracy for a large class of perturbations on the test set. Furthermore, our experiments show that there is an inherent trade-off between fairness robustness and accuracy of such classifiers.

Submitted 4 November, 2020; v1 submitted 12 July, 2020; originally announced July 2020.

Comments: 18 pages, 3 figures, To appear at NeurIPS-2020

arXiv:2002.06276

Trustworthy AI

Authors: Jeannette M. Wing

Abstract: The promise of AI is huge. AI systems have already achieved good enough performance to be in our streets and in our homes. However, they can be brittle and unfair. For society to reap the benefits of AI systems, society needs to be able to trust them. Inspired by decades of progress in trustworthy computing, we suggest what trustworthy properties would be desired of AI systems. By enumerating a set of new research questions, we explore one approach--formal verification--for ensuring trust in AI. Trustworthy AI ups the ante on both trustworthy computing and formal methods.

Submitted 14 February, 2020; originally announced February 2020.

Comments: 12 pages

ACM Class: C.4; D.3.1; D.4.6; F.3.1; G.3; I.2

arXiv:2002.05658

Ten Research Challenge Areas in Data Science

Authors: Jeannette M. Wing

Abstract: Although data science builds on knowledge from computer science, mathematics, statistics, and other disciplines, data science is a unique field with many mysteries to unlock: challenging scientific questions and pressing questions of societal importance. This article starts with meta-questions about data science as a discipline and then elaborates on ten ideas for the basis of a research agenda for data science.

Submitted 27 January, 2020; originally announced February 2020.

ACM Class: A.0; E.0; G.3; I.2; I.5

arXiv:1510.03311

Inverse Privacy

Authors: Yuri Gurevich, Efim Hudis, Jeannette M. Wing

Abstract: An item of your personal information is inversely private if some party has access to it but you do not. We analyze the provenance of inversely private information and its rise to dominance over other kinds of personal information. In a nutshell, the inverse privacy problem is unjustified inaccessibility to you of your inversely private information. We argue that the inverse privacy problem has a market-based solution.

Submitted 12 October, 2015; originally announced October 2015.

Comments: The article is submitted to a journal

arXiv:1405.2376

A Methodology for Information Flow Experiments

Authors: Michael Carl Tschantz, Amit Datta, Anupam Datta, Jeannette M. Wing

Abstract: Information flow analysis has largely ignored the setting where the analyst has neither control over nor a complete model of the analyzed system. We formalize such limited information flow analyses and study an instance of it: detecting the usage of data by websites. We prove that these problems are ones of causal inference. Leveraging this connection, we push beyond traditional information flow analysis to provide a systematic methodology based on experimental science and statistical analysis. Our methodology allows us to systematize prior works in the area viewing them as instances of a general approach. Our systematic study leads to practical advice for improving work on detecting data usage, a previously unformalized area. We illustrate these concepts with a series of experiments collecting data on the use of information by websites, which we statistically analyze.

Submitted 9 May, 2014; originally announced May 2014.

arXiv:1102.4326

On the Semantics of Purpose Requirements in Privacy Policies

Authors: Michael Carl Tschantz, Anupam Datta, Jeannette M. Wing

Abstract: Privacy policies often place requirements on the purposes for which a governed entity may use personal information. For example, regulations, such as HIPAA, require that hospital employees use medical information for only certain purposes, such as treatment. Thus, using formal or automated methods for enforcing privacy policies requires a semantics of purpose requirements to determine whether an action is for a purpose or not. We provide such a semantics using a formalism based on planning. We model planning using a modified version of Markov Decision Processes, which exclude redundant actions for a formal definition of redundant. We use the model to formalize when a sequence of actions is only for or not for a purpose. This semantics enables us to provide an algorithm for automating auditing, and to describe formally and compare rigorously previous enforcement methods.

Submitted 21 February, 2011; originally announced February 2011.

Comments: 34 pages, 3 figures. Tech report, School of Computer Science, Carnegie Mellon University. Submitted to the 24th IEEE Computer Security Foundations Symposium

Report number: CMU-CS-11-102

Showing 1–6 of 6 results for author: Wing, J M