-
Should you use LLMs to simulate opinions? Quality checks for early-stage deliberation
Authors:
Terrence Neumann,
Maria De-Arteaga,
Sina Fazelpour
Abstract:
The emergent capabilities of large language models (LLMs) have sparked interest in assessing their ability to simulate human opinions in a variety of contexts, potentially serving as surrogates for human subjects in opinion surveys. However, previous evaluations of this capability have depended heavily on costly, domain-specific human survey data, and mixed empirical results about LLM effectivenes…
▽ More
The emergent capabilities of large language models (LLMs) have sparked interest in assessing their ability to simulate human opinions in a variety of contexts, potentially serving as surrogates for human subjects in opinion surveys. However, previous evaluations of this capability have depended heavily on costly, domain-specific human survey data, and mixed empirical results about LLM effectiveness create uncertainty for managers about whether investing in this technology is justified in early-stage research. To address these challenges, we introduce a series of quality checks to support early-stage deliberation about the viability of using LLMs for simulating human opinions. These checks emphasize logical constraints, model stability, and alignment with stakeholder expectations of model outputs, thereby reducing dependence on human-generated data in the initial stages of evaluation. We demonstrate the usefulness of the proposed quality control tests in the context of AI-assisted content moderation, an application that both advocates and critics of LLMs' capabilities to simulate human opinion see as a desirable potential use case. None of the tested models passed all quality control checks, revealing several failure modes. We conclude by discussing implications of these failure modes and recommend how organizations can utilize our proposed tests for prompt engineering and in their risk management practices when considering the use of LLMs for opinion simulation. We make our crowdsourced dataset of claims with human and LLM annotations publicly available for future research.
△ Less
Submitted 1 May, 2025; v1 submitted 11 April, 2025;
originally announced April 2025.
-
Perils of Label Indeterminacy: A Case Study on Prediction of Neurological Recovery After Cardiac Arrest
Authors:
Jakob Schoeffer,
Maria De-Arteaga,
Jonathan Elmer
Abstract:
The design of AI systems to assist human decision-making typically requires the availability of labels to train and evaluate supervised models. Frequently, however, these labels are unknown, and different ways of estimating them involve unverifiable assumptions or arbitrary choices. In this work, we introduce the concept of label indeterminacy and derive important implications in high-stakes AI-as…
▽ More
The design of AI systems to assist human decision-making typically requires the availability of labels to train and evaluate supervised models. Frequently, however, these labels are unknown, and different ways of estimating them involve unverifiable assumptions or arbitrary choices. In this work, we introduce the concept of label indeterminacy and derive important implications in high-stakes AI-assisted decision-making. We present an empirical study in a healthcare context, focusing specifically on predicting the recovery of comatose patients after resuscitation from cardiac arrest. Our study shows that label indeterminacy can result in models that perform similarly when evaluated on patients with known labels, but vary drastically in their predictions for patients where labels are unknown. After demonstrating crucial ethical implications of label indeterminacy in this high-stakes context, we discuss takeaways for evaluation, reporting, and design.
△ Less
Submitted 7 May, 2025; v1 submitted 5 April, 2025;
originally announced April 2025.
-
More of the Same: Persistent Representational Harms Under Increased Representation
Authors:
Jennifer Mickel,
Maria De-Arteaga,
Leqi Liu,
Kevin Tian
Abstract:
To recognize and mitigate the harms of generative AI systems, it is crucial to consider who is represented in the outputs of generative AI systems and how people are represented. A critical gap emerges when naively improving who is represented, as this does not imply bias mitigation efforts have been applied to address how people are represented. We critically examined this by investigating gender…
▽ More
To recognize and mitigate the harms of generative AI systems, it is crucial to consider who is represented in the outputs of generative AI systems and how people are represented. A critical gap emerges when naively improving who is represented, as this does not imply bias mitigation efforts have been applied to address how people are represented. We critically examined this by investigating gender representation in occupation across state-of-the-art large language models. We first show evidence suggesting that over time there have been interventions to models altering the resulting gender distribution, and we find that women are more represented than men when models are prompted to generate biographies or personas. We then demonstrate that representational biases persist in how different genders are represented by examining statistically significant word differences across genders. This results in a proliferation of representational harms, stereotypes, and neoliberalism ideals that, despite existing interventions to increase female representation, reinforce existing systems of oppression.
△ Less
Submitted 28 February, 2025;
originally announced March 2025.
-
Using Machine Bias To Measure Human Bias
Authors:
Wanxue Dong,
Maria De-Arteaga,
Maytal Saar-Tsechansky
Abstract:
Biased human decisions have consequential impacts across various domains, yielding unfair treatment of individuals and resulting in suboptimal outcomes for organizations and society. In recognition of this fact, organizations regularly design and deploy interventions aimed at mitigating these biases. However, measuring human decision biases remains an important but elusive task. Organizations are…
▽ More
Biased human decisions have consequential impacts across various domains, yielding unfair treatment of individuals and resulting in suboptimal outcomes for organizations and society. In recognition of this fact, organizations regularly design and deploy interventions aimed at mitigating these biases. However, measuring human decision biases remains an important but elusive task. Organizations are frequently concerned with mistaken decisions disproportionately affecting one group. In practice, however, this is typically not possible to assess due to the scarcity of a gold standard: a label that indicates what the correct decision would have been. In this work, we propose a machine learning-based framework to assess bias in human-generated decisions when gold standard labels are scarce. We provide theoretical guarantees and empirical evidence demonstrating the superiority of our method over existing alternatives. This proposed methodology establishes a foundation for transparency in human decision-making, carrying substantial implications for managerial duties, and offering potential for alleviating algorithmic biases when human decisions are used as labels to train algorithms.
△ Less
Submitted 10 December, 2024; v1 submitted 27 November, 2024;
originally announced November 2024.
-
Fairly Accurate: Optimizing Accuracy Parity in Fair Target-Group Detection
Authors:
Soumyajit Gupta,
Venelin Kovatchev,
Maria De-Arteaga,
Matthew Lease
Abstract:
In algorithmic toxicity detection pipelines, it is important to identify which demographic group(s) are the subject of a post, a task commonly known as \textit{target (group) detection}. While accurate detection is clearly important, we further advocate a fairness objective: to provide equal protection to all groups who may be targeted. To this end, we adopt \textit{Accuracy Parity} (AP) -- balanc…
▽ More
In algorithmic toxicity detection pipelines, it is important to identify which demographic group(s) are the subject of a post, a task commonly known as \textit{target (group) detection}. While accurate detection is clearly important, we further advocate a fairness objective: to provide equal protection to all groups who may be targeted. To this end, we adopt \textit{Accuracy Parity} (AP) -- balanced detection accuracy across groups -- as our fairness objective. However, in order to align model training with our AP fairness objective, we require an equivalent loss function. Moreover, for gradient-based models such as neural networks, this loss function needs to be differentiable. Because no such loss function exists today for AP, we propose \emph{Group Accuracy Parity} (GAP): the first differentiable loss function having a one-on-one mapping to AP. We empirically show that GAP addresses disparate impact on groups for target detection. Furthermore, because a single post often targets multiple groups in practice, we also provide a mathematical extension of GAP to larger multi-group settings, something typically requiring heuristics in prior work. Our findings show that by optimizing AP, GAP better mitigates bias in comparison with other commonly employed loss functions.
△ Less
Submitted 16 July, 2024;
originally announced July 2024.
-
Diverse, but Divisive: LLMs Can Exaggerate Gender Differences in Opinion Related to Harms of Misinformation
Authors:
Terrence Neumann,
Sooyong Lee,
Maria De-Arteaga,
Sina Fazelpour,
Matthew Lease
Abstract:
The pervasive spread of misinformation and disinformation poses a significant threat to society. Professional fact-checkers play a key role in addressing this threat, but the vast scale of the problem forces them to prioritize their limited resources. This prioritization may consider a range of factors, such as varying risks of harm posed to specific groups of people. In this work, we investigate…
▽ More
The pervasive spread of misinformation and disinformation poses a significant threat to society. Professional fact-checkers play a key role in addressing this threat, but the vast scale of the problem forces them to prioritize their limited resources. This prioritization may consider a range of factors, such as varying risks of harm posed to specific groups of people. In this work, we investigate potential implications of using a large language model (LLM) to facilitate such prioritization. Because fact-checking impacts a wide range of diverse segments of society, it is important that diverse views are represented in the claim prioritization process. This paper examines whether a LLM can reflect the views of various groups when assessing the harms of misinformation, focusing on gender as a primary variable. We pose two central questions: (1) To what extent do prompts with explicit gender references reflect gender differences in opinion in the United States on topics of social relevance? and (2) To what extent do gender-neutral prompts align with gendered viewpoints on those topics? To analyze these questions, we present the TopicMisinfo dataset, containing 160 fact-checked claims from diverse topics, supplemented by nearly 1600 human annotations with subjective perceptions and annotator demographics. Analyzing responses to gender-specific and neutral prompts, we find that GPT 3.5-Turbo reflects empirically observed gender differences in opinion but amplifies the extent of these differences. These findings illuminate AI's complex role in moderating online communication, with implications for fact-checkers, algorithm designers, and the use of crowd-workers as annotators. We also release the TopicMisinfo dataset to support continuing research in the community.
△ Less
Submitted 29 January, 2024;
originally announced January 2024.
-
A Critical Survey on Fairness Benefits of Explainable AI
Authors:
Luca Deck,
Jakob Schoeffer,
Maria De-Arteaga,
Niklas Kühl
Abstract:
In this critical survey, we analyze typical claims on the relationship between explainable AI (XAI) and fairness to disentangle the multidimensional relationship between these two concepts. Based on a systematic literature review and a subsequent qualitative content analysis, we identify seven archetypal claims from 175 scientific articles on the alleged fairness benefits of XAI. We present crucia…
▽ More
In this critical survey, we analyze typical claims on the relationship between explainable AI (XAI) and fairness to disentangle the multidimensional relationship between these two concepts. Based on a systematic literature review and a subsequent qualitative content analysis, we identify seven archetypal claims from 175 scientific articles on the alleged fairness benefits of XAI. We present crucial caveats with respect to these claims and provide an entry point for future discussions around the potentials and limitations of XAI for specific fairness desiderata. Importantly, we notice that claims are often (i) vague and simplistic, (ii) lacking normative grounding, or (iii) poorly aligned with the actual capabilities of XAI. We suggest to conceive XAI not as an ethical panacea but as one of many tools to approach the multidimensional, sociotechnical challenge of algorithmic fairness. Moreover, when making a claim about XAI and fairness, we emphasize the need to be more specific about what kind of XAI method is used, which fairness desideratum it refers to, how exactly it enables fairness, and who is the stakeholder that benefits from XAI.
△ Less
Submitted 7 May, 2024; v1 submitted 15 October, 2023;
originally announced October 2023.
-
Mitigating Label Bias via Decoupled Confident Learning
Authors:
Yunyi Li,
Maria De-Arteaga,
Maytal Saar-Tsechansky
Abstract:
Growing concerns regarding algorithmic fairness have led to a surge in methodologies to mitigate algorithmic bias. However, such methodologies largely assume that observed labels in training data are correct. This is problematic because bias in labels is pervasive across important domains, including healthcare, hiring, and content moderation. In particular, human-generated labels are prone to enco…
▽ More
Growing concerns regarding algorithmic fairness have led to a surge in methodologies to mitigate algorithmic bias. However, such methodologies largely assume that observed labels in training data are correct. This is problematic because bias in labels is pervasive across important domains, including healthcare, hiring, and content moderation. In particular, human-generated labels are prone to encoding societal biases. While the presence of labeling bias has been discussed conceptually, there is a lack of methodologies to address this problem. We propose a pruning method -- Decoupled Confident Learning (DeCoLe) -- specifically designed to mitigate label bias. After illustrating its performance on a synthetic dataset, we apply DeCoLe in the context of hate speech detection, where label bias has been recognized as an important challenge, and show that it successfully identifies biased labels and outperforms competing approaches.
△ Less
Submitted 29 September, 2023; v1 submitted 17 July, 2023;
originally announced July 2023.
-
Human-Centered Responsible Artificial Intelligence: Current & Future Trends
Authors:
Mohammad Tahaei,
Marios Constantinides,
Daniele Quercia,
Sean Kennedy,
Michael Muller,
Simone Stumpf,
Q. Vera Liao,
Ricardo Baeza-Yates,
Lora Aroyo,
Jess Holbrook,
Ewa Luger,
Michael Madaio,
Ilana Golbin Blumenfeld,
Maria De-Arteaga,
Jessica Vitak,
Alexandra Olteanu
Abstract:
In recent years, the CHI community has seen significant growth in research on Human-Centered Responsible Artificial Intelligence. While different research communities may use different terminology to discuss similar topics, all of this work is ultimately aimed at developing AI that benefits humanity while being grounded in human rights and ethics, and reducing the potential harms of AI. In this sp…
▽ More
In recent years, the CHI community has seen significant growth in research on Human-Centered Responsible Artificial Intelligence. While different research communities may use different terminology to discuss similar topics, all of this work is ultimately aimed at developing AI that benefits humanity while being grounded in human rights and ethics, and reducing the potential harms of AI. In this special interest group, we aim to bring together researchers from academia and industry interested in these topics to map current and future research trends to advance this important area of research by fostering collaboration and sharing ideas.
△ Less
Submitted 16 February, 2023;
originally announced February 2023.
-
Same Same, But Different: Conditional Multi-Task Learning for Demographic-Specific Toxicity Detection
Authors:
Soumyajit Gupta,
Sooyong Lee,
Maria De-Arteaga,
Matthew Lease
Abstract:
Algorithmic bias often arises as a result of differential subgroup validity, in which predictive relationships vary across groups. For example, in toxic language detection, comments targeting different demographic groups can vary markedly across groups. In such settings, trained models can be dominated by the relationships that best fit the majority group, leading to disparate performance. We prop…
▽ More
Algorithmic bias often arises as a result of differential subgroup validity, in which predictive relationships vary across groups. For example, in toxic language detection, comments targeting different demographic groups can vary markedly across groups. In such settings, trained models can be dominated by the relationships that best fit the majority group, leading to disparate performance. We propose framing toxicity detection as multi-task learning (MTL), allowing a model to specialize on the relationships that are relevant to each demographic group while also leveraging shared properties across groups. With toxicity detection, each task corresponds to identifying toxicity against a particular demographic group. However, traditional MTL requires labels for all tasks to be present for every data point. To address this, we propose Conditional MTL (CondMTL), wherein only training examples relevant to the given demographic group are considered by the loss function. This lets us learn group specific representations in each branch which are not cross contaminated by irrelevant labels. Results on synthetic and real data show that using CondMTL improves predictive recall over various baselines in general and for the minority demographic group in particular, while having similar overall accuracy.
△ Less
Submitted 6 March, 2023; v1 submitted 14 February, 2023;
originally announced February 2023.
-
Learning Complementary Policies for Human-AI Teams
Authors:
Ruijiang Gao,
Maytal Saar-Tsechansky,
Maria De-Arteaga,
Ligong Han,
Wei Sun,
Min Kyung Lee,
Matthew Lease
Abstract:
Human-AI complementarity is important when neither the algorithm nor the human yields dominant performance across all instances in a given context. Recent work that explored human-AI collaboration has considered decisions that correspond to classification tasks. However, in many important contexts where humans can benefit from AI complementarity, humans undertake course of action. In this paper, w…
▽ More
Human-AI complementarity is important when neither the algorithm nor the human yields dominant performance across all instances in a given context. Recent work that explored human-AI collaboration has considered decisions that correspond to classification tasks. However, in many important contexts where humans can benefit from AI complementarity, humans undertake course of action. In this paper, we propose a framework for a novel human-AI collaboration for selecting advantageous course of action, which we refer to as Learning Complementary Policy for Human-AI teams (\textsc{lcp-hai}). Our solution aims to exploit the human-AI complementarity to maximize decision rewards by learning both an algorithmic policy that aims to complement humans by a routing model that defers decisions to either a human or the AI to leverage the resulting complementarity. We then extend our approach to leverage opportunities and mitigate risks that arise in important contexts in practice: 1) when a team is composed of multiple humans with differential and potentially complementary abilities, 2) when the observational data includes consistent deterministic actions, and 3) when the covariate distribution of future decisions differ from that in the historical data. We demonstrate the effectiveness of our proposed methods using data on real human responses and semi-synthetic, and find that our methods offer reliable and advantageous performance across setting, and that it is superior to when either the algorithm or the AI make decisions on their own. We also find that the extensions we propose effectively improve the robustness of the human-AI collaboration performance in the presence of different challenging settings.
△ Less
Submitted 6 February, 2023;
originally announced February 2023.
-
Explanations, Fairness, and Appropriate Reliance in Human-AI Decision-Making
Authors:
Jakob Schoeffer,
Maria De-Arteaga,
Niklas Kuehl
Abstract:
In this work, we study the effects of feature-based explanations on distributive fairness of AI-assisted decisions, specifically focusing on the task of predicting occupations from short textual bios. We also investigate how any effects are mediated by humans' fairness perceptions and their reliance on AI recommendations. Our findings show that explanations influence fairness perceptions, which, i…
▽ More
In this work, we study the effects of feature-based explanations on distributive fairness of AI-assisted decisions, specifically focusing on the task of predicting occupations from short textual bios. We also investigate how any effects are mediated by humans' fairness perceptions and their reliance on AI recommendations. Our findings show that explanations influence fairness perceptions, which, in turn, relate to humans' tendency to adhere to AI recommendations. However, we see that such explanations do not enable humans to discern correct and incorrect AI recommendations. Instead, we show that they may affect reliance irrespective of the correctness of AI recommendations. Depending on which features an explanation highlights, this can foster or hinder distributive fairness: when explanations highlight features that are task-irrelevant and evidently associated with the sensitive attribute, this prompts overrides that counter AI recommendations that align with gender stereotypes. Meanwhile, if explanations appear task-relevant, this induces reliance behavior that reinforces stereotype-aligned errors. These results imply that feature-based explanations are not a reliable mechanism to improve distributive fairness.
△ Less
Submitted 18 March, 2024; v1 submitted 23 September, 2022;
originally announced September 2022.
-
Imputation Strategies Under Clinical Presence: Impact on Algorithmic Fairness
Authors:
Vincent Jeanselme,
Maria De-Arteaga,
Zhe Zhang,
Jessica Barrett,
Brian Tom
Abstract:
Machine learning risks reinforcing biases present in data and, as we argue in this work, in what is absent from data. In healthcare, societal and decision biases shape patterns in missing data, yet the algorithmic fairness implications of group-specific missingness are poorly understood. The way we address missingness in healthcare can have detrimental impacts on downstream algorithmic fairness. O…
▽ More
Machine learning risks reinforcing biases present in data and, as we argue in this work, in what is absent from data. In healthcare, societal and decision biases shape patterns in missing data, yet the algorithmic fairness implications of group-specific missingness are poorly understood. The way we address missingness in healthcare can have detrimental impacts on downstream algorithmic fairness. Our work questions current recommendations and practices aimed at handling missing data with a focus on their effect on algorithmic fairness, and offers a path forward. Specifically, we consider the theoretical underpinnings of existing recommendations as well as their empirical predictive performance and corresponding algorithmic fairness measured through subgroup performances. Our results show that current practices for handling missingness lack principled foundations, are disconnected from the realities of missingness mechanisms in healthcare, and can be counterproductive. For example, we show that favouring group-specific imputation strategy can be misguided and exacerbate prediction disparities. We then build on our findings to propose a framework for empirically guiding imputation choices, and an accompanying reporting framework. Our work constitutes an important contribution to recent efforts by regulators and practitioners to grapple with the realities of real-world data, and to foster the responsible and transparent deployment of machine learning systems. We demonstrate the practical utility of the proposed framework through experimentation on widely used datasets, where we show how the proposed framework can guide the selection of imputation strategies, allowing us to choose among strategies that yield equal overall predictive performance but present different algorithmic fairness properties.
△ Less
Submitted 17 March, 2025; v1 submitted 13 August, 2022;
originally announced August 2022.
-
Toward Supporting Perceptual Complementarity in Human-AI Collaboration via Reflection on Unobservables
Authors:
Kenneth Holstein,
Maria De-Arteaga,
Lakshmi Tumati,
Yanghuidi Cheng
Abstract:
In many real world contexts, successful human-AI collaboration requires humans to productively integrate complementary sources of information into AI-informed decisions. However, in practice human decision-makers often lack understanding of what information an AI model has access to in relation to themselves. There are few available guidelines regarding how to effectively communicate about unobser…
▽ More
In many real world contexts, successful human-AI collaboration requires humans to productively integrate complementary sources of information into AI-informed decisions. However, in practice human decision-makers often lack understanding of what information an AI model has access to in relation to themselves. There are few available guidelines regarding how to effectively communicate about unobservables: features that may influence the outcome, but which are unavailable to the model. In this work, we conducted an online experiment to understand whether and how explicitly communicating potentially relevant unobservables influences how people integrate model outputs and unobservables when making predictions. Our findings indicate that presenting prompts about unobservables can change how humans integrate model outputs and unobservables, but do not necessarily lead to improved performance. Furthermore, the impacts of these prompts can vary depending on decision-makers' prior domain expertise. We conclude by discussing implications for future research and design of AI-based decision support tools.
△ Less
Submitted 26 January, 2023; v1 submitted 27 July, 2022;
originally announced July 2022.
-
Algorithmic Fairness in Business Analytics: Directions for Research and Practice
Authors:
Maria De-Arteaga,
Stefan Feuerriegel,
Maytal Saar-Tsechansky
Abstract:
The extensive adoption of business analytics (BA) has brought financial gains and increased efficiencies. However, these advances have simultaneously drawn attention to rising legal and ethical challenges when BA inform decisions with fairness implications. As a response to these concerns, the emerging study of algorithmic fairness deals with algorithmic outputs that may result in disparate outcom…
▽ More
The extensive adoption of business analytics (BA) has brought financial gains and increased efficiencies. However, these advances have simultaneously drawn attention to rising legal and ethical challenges when BA inform decisions with fairness implications. As a response to these concerns, the emerging study of algorithmic fairness deals with algorithmic outputs that may result in disparate outcomes or other forms of injustices for subgroups of the population, especially those who have been historically marginalized. Fairness is relevant on the basis of legal compliance, social responsibility, and utility; if not adequately and systematically addressed, unfair BA systems may lead to societal harms and may also threaten an organization's own survival, its competitiveness, and overall performance. This paper offers a forward-looking, BA-focused review of algorithmic fairness. We first review the state-of-the-art research on sources and measures of bias, as well as bias mitigation algorithms. We then provide a detailed discussion of the utility-fairness relationship, emphasizing that the frequent assumption of a trade-off between these two constructs is often mistaken or short-sighted. Finally, we chart a path forward by identifying opportunities for business scholars to address impactful, open challenges that are key to the effective and responsible deployment of BA.
△ Less
Submitted 22 July, 2022;
originally announced July 2022.
-
More Data Can Lead Us Astray: Active Data Acquisition in the Presence of Label Bias
Authors:
Yunyi Li,
Maria De-Arteaga,
Maytal Saar-Tsechansky
Abstract:
An increased awareness concerning risks of algorithmic bias has driven a surge of efforts around bias mitigation strategies. A vast majority of the proposed approaches fall under one of two categories: (1) imposing algorithmic fairness constraints on predictive models, and (2) collecting additional training samples. Most recently and at the intersection of these two categories, methods that propos…
▽ More
An increased awareness concerning risks of algorithmic bias has driven a surge of efforts around bias mitigation strategies. A vast majority of the proposed approaches fall under one of two categories: (1) imposing algorithmic fairness constraints on predictive models, and (2) collecting additional training samples. Most recently and at the intersection of these two categories, methods that propose active learning under fairness constraints have been developed. However, proposed bias mitigation strategies typically overlook the bias presented in the observed labels. In this work, we study fairness considerations of active data collection strategies in the presence of label bias. We first present an overview of different types of label bias in the context of supervised learning systems. We then empirically show that, when overlooking label bias, collecting more data can aggravate bias, and imposing fairness constraints that rely on the observed labels in the data collection process may not address the problem. Our results illustrate the unintended consequences of deploying a model that attempts to mitigate a single type of bias while neglecting others, emphasizing the importance of explicitly differentiating between the types of bias that fairness-aware algorithms aim to address, and highlighting the risks of neglecting label bias during data collection.
△ Less
Submitted 15 July, 2022;
originally announced July 2022.
-
Doubting AI Predictions: Influence-Driven Second Opinion Recommendation
Authors:
Maria De-Arteaga,
Alexandra Chouldechova,
Artur Dubrawski
Abstract:
Effective human-AI collaboration requires a system design that provides humans with meaningful ways to make sense of and critically evaluate algorithmic recommendations. In this paper, we propose a way to augment human-AI collaboration by building on a common organizational practice: identifying experts who are likely to provide complementary opinions. When machine learning algorithms are trained…
▽ More
Effective human-AI collaboration requires a system design that provides humans with meaningful ways to make sense of and critically evaluate algorithmic recommendations. In this paper, we propose a way to augment human-AI collaboration by building on a common organizational practice: identifying experts who are likely to provide complementary opinions. When machine learning algorithms are trained to predict human-generated assessments, experts' rich multitude of perspectives is frequently lost in monolithic algorithmic recommendations. The proposed approach aims to leverage productive disagreement by (1) identifying whether some experts are likely to disagree with an algorithmic assessment and, if so, (2) recommend an expert to request a second opinion from.
△ Less
Submitted 29 April, 2022;
originally announced May 2022.
-
Justice in Misinformation Detection Systems: An Analysis of Algorithms, Stakeholders, and Potential Harms
Authors:
Terrence Neumann,
Maria De-Arteaga,
Sina Fazelpour
Abstract:
Faced with the scale and surge of misinformation on social media, many platforms and fact-checking organizations have turned to algorithms for automating key parts of misinformation detection pipelines. While offering a promising solution to the challenge of scale, the ethical and societal risks associated with algorithmic misinformation detection are not well-understood. In this paper, we employ…
▽ More
Faced with the scale and surge of misinformation on social media, many platforms and fact-checking organizations have turned to algorithms for automating key parts of misinformation detection pipelines. While offering a promising solution to the challenge of scale, the ethical and societal risks associated with algorithmic misinformation detection are not well-understood. In this paper, we employ and extend upon the notion of informational justice to develop a framework for explicating issues of justice relating to representation, participation, distribution of benefits and burdens, and credibility in the misinformation detection pipeline. Drawing on the framework: (1) we show how injustices materialize for stakeholders across three algorithmic stages in the pipeline; (2) we suggest empirical measures for assessing these injustices; and (3) we identify potential sources of these harms. This framework should help researchers, policymakers, and practitioners reason about potential harms or risks associated with these algorithms and provide conceptual guidance for the design of algorithmic fairness audits in this domain.
△ Less
Submitted 29 April, 2022; v1 submitted 28 April, 2022;
originally announced April 2022.
-
On the Relationship Between Explanations, Fairness Perceptions, and Decisions
Authors:
Jakob Schoeffer,
Maria De-Arteaga,
Niklas Kuehl
Abstract:
It is known that recommendations of AI-based systems can be incorrect or unfair. Hence, it is often proposed that a human be the final decision-maker. Prior work has argued that explanations are an essential pathway to help human decision-makers enhance decision quality and mitigate bias, i.e., facilitate human-AI complementarity. For these benefits to materialize, explanations should enable human…
▽ More
It is known that recommendations of AI-based systems can be incorrect or unfair. Hence, it is often proposed that a human be the final decision-maker. Prior work has argued that explanations are an essential pathway to help human decision-makers enhance decision quality and mitigate bias, i.e., facilitate human-AI complementarity. For these benefits to materialize, explanations should enable humans to appropriately rely on AI recommendations and override the algorithmic recommendation when necessary to increase distributive fairness of decisions. The literature, however, does not provide conclusive empirical evidence as to whether explanations enable such complementarity in practice. In this work, we (a) provide a conceptual framework to articulate the relationships between explanations, fairness perceptions, reliance, and distributive fairness, (b) apply it to understand (seemingly) contradictory research findings at the intersection of explanations and fairness, and (c) derive cohesive implications for the formulation of research questions and the design of experiments.
△ Less
Submitted 6 May, 2022; v1 submitted 27 April, 2022;
originally announced April 2022.
-
Finding Pareto Trade-offs in Fair and Accurate Detection of Toxic Speech
Authors:
Soumyajit Gupta,
Venelin Kovatchev,
Anubrata Das,
Maria De-Arteaga,
Matthew Lease
Abstract:
Optimizing NLP models for fairness poses many challenges. Lack of differentiable fairness measures prevents gradient-based loss training or requires surrogate losses that diverge from the true metric of interest. In addition, competing objectives (e.g., accuracy vs. fairness) often require making trade-offs based on stakeholder preferences, but stakeholders may not know their preferences before se…
▽ More
Optimizing NLP models for fairness poses many challenges. Lack of differentiable fairness measures prevents gradient-based loss training or requires surrogate losses that diverge from the true metric of interest. In addition, competing objectives (e.g., accuracy vs. fairness) often require making trade-offs based on stakeholder preferences, but stakeholders may not know their preferences before seeing system performance under different trade-off settings. To address these challenges, we begin by formulating a differentiable version of a popular fairness measure, Accuracy Parity, to provide balanced accuracy across demographic groups. Next, we show how model-agnostic, HyperNetwork optimization can efficiently train arbitrary NLP model architectures to learn Pareto-optimal trade-offs between competing metrics. Focusing on the task of toxic language detection, we show the generality and efficacy of our methods across two datasets, three neural architectures, and three fairness losses.
△ Less
Submitted 9 April, 2025; v1 submitted 15 April, 2022;
originally announced April 2022.
-
Social Norm Bias: Residual Harms of Fairness-Aware Algorithms
Authors:
Myra Cheng,
Maria De-Arteaga,
Lester Mackey,
Adam Tauman Kalai
Abstract:
Many modern machine learning algorithms mitigate bias by enforcing fairness constraints across coarsely-defined groups related to a sensitive attribute like gender or race. However, these algorithms seldom account for within-group heterogeneity and biases that may disproportionately affect some members of a group. In this work, we characterize Social Norm Bias (SNoB), a subtle but consequential ty…
▽ More
Many modern machine learning algorithms mitigate bias by enforcing fairness constraints across coarsely-defined groups related to a sensitive attribute like gender or race. However, these algorithms seldom account for within-group heterogeneity and biases that may disproportionately affect some members of a group. In this work, we characterize Social Norm Bias (SNoB), a subtle but consequential type of algorithmic discrimination that may be exhibited by machine learning models, even when these systems achieve group fairness objectives. We study this issue through the lens of gender bias in occupation classification. We quantify SNoB by measuring how an algorithm's predictions are associated with conformity to inferred gender norms. When predicting if an individual belongs to a male-dominated occupation, this framework reveals that "fair" classifiers still favor biographies written in ways that align with inferred masculine norms. We compare SNoB across algorithmic fairness methods and show that it is frequently a residual bias, and post-processing approaches do not mitigate this type of bias at all.
△ Less
Submitted 10 August, 2022; v1 submitted 25 August, 2021;
originally announced August 2021.
-
Diversity in Sociotechnical Machine Learning Systems
Authors:
Sina Fazelpour,
Maria De-Arteaga
Abstract:
There has been a surge of recent interest in sociocultural diversity in machine learning (ML) research, with researchers (i) examining the benefits of diversity as an organizational solution for alleviating problems with algorithmic bias, and (ii) proposing measures and methods for implementing diversity as a design desideratum in the construction of predictive algorithms. Currently, however, ther…
▽ More
There has been a surge of recent interest in sociocultural diversity in machine learning (ML) research, with researchers (i) examining the benefits of diversity as an organizational solution for alleviating problems with algorithmic bias, and (ii) proposing measures and methods for implementing diversity as a design desideratum in the construction of predictive algorithms. Currently, however, there is a gap between discussions of measures and benefits of diversity in ML, on the one hand, and the broader research on the underlying concepts of diversity and the precise mechanisms of its functional benefits, on the other. This gap is problematic because diversity is not a monolithic concept. Rather, different concepts of diversity are based on distinct rationales that should inform how we measure diversity in a given context. Similarly, the lack of specificity about the precise mechanisms underpinning diversity's potential benefits can result in uninformative generalities, invalid experimental designs, and illicit interpretations of findings. In this work, we draw on research in philosophy, psychology, and social and organizational sciences to make three contributions: First, we introduce a taxonomy of different diversity concepts from philosophy of science, and explicate the distinct epistemic and political rationales underlying these concepts. Second, we provide an overview of mechanisms by which diversity can benefit group performance. Third, we situate these taxonomies--of concepts and mechanisms--in the lifecycle of sociotechnical ML systems and make a case for their usefulness in fair and accountable ML. We do so by illustrating how they clarify the discourse around diversity in the context of ML systems, promote the formulation of more precise research questions about diversity's impact, and provide conceptual tools to further advance research and practice.
△ Less
Submitted 19 July, 2021;
originally announced July 2021.
-
Human-AI Collaboration with Bandit Feedback
Authors:
Ruijiang Gao,
Maytal Saar-Tsechansky,
Maria De-Arteaga,
Ligong Han,
Min Kyung Lee,
Matthew Lease
Abstract:
Human-machine complementarity is important when neither the algorithm nor the human yield dominant performance across all instances in a given domain. Most research on algorithmic decision-making solely centers on the algorithm's performance, while recent work that explores human-machine collaboration has framed the decision-making problems as classification tasks. In this paper, we first propose…
▽ More
Human-machine complementarity is important when neither the algorithm nor the human yield dominant performance across all instances in a given domain. Most research on algorithmic decision-making solely centers on the algorithm's performance, while recent work that explores human-machine collaboration has framed the decision-making problems as classification tasks. In this paper, we first propose and then develop a solution for a novel human-machine collaboration problem in a bandit feedback setting. Our solution aims to exploit the human-machine complementarity to maximize decision rewards. We then extend our approach to settings with multiple human decision makers. We demonstrate the effectiveness of our proposed methods using both synthetic and real human responses, and find that our methods outperform both the algorithm and the human when they each make decisions on their own. We also show how personalized routing in the presence of multiple human decision-makers can further improve the human-machine team performance.
△ Less
Submitted 21 May, 2021;
originally announced May 2021.
-
The effect of differential victim crime reporting on predictive policing systems
Authors:
Nil-Jana Akpinar,
Maria De-Arteaga,
Alexandra Chouldechova
Abstract:
Police departments around the world have been experimenting with forms of place-based data-driven proactive policing for over two decades. Modern incarnations of such systems are commonly known as hot spot predictive policing. These systems predict where future crime is likely to concentrate such that police can allocate patrols to these areas and deter crime before it occurs. Previous research on…
▽ More
Police departments around the world have been experimenting with forms of place-based data-driven proactive policing for over two decades. Modern incarnations of such systems are commonly known as hot spot predictive policing. These systems predict where future crime is likely to concentrate such that police can allocate patrols to these areas and deter crime before it occurs. Previous research on fairness in predictive policing has concentrated on the feedback loops which occur when models are trained on discovered crime data, but has limited implications for models trained on victim crime reporting data. We demonstrate how differential victim crime reporting rates across geographical areas can lead to outcome disparities in common crime hot spot prediction models. Our analysis is based on a simulation patterned after district-level victimization and crime reporting survey data for Bogotá, Colombia. Our results suggest that differential crime reporting rates can lead to a displacement of predicted hotspots from high crime but low reporting areas to high or medium crime and high reporting areas. This may lead to misallocations both in the form of over-policing and under-policing.
△ Less
Submitted 4 February, 2021; v1 submitted 29 January, 2021;
originally announced February 2021.
-
Leveraging Expert Consistency to Improve Algorithmic Decision Support
Authors:
Maria De-Arteaga,
Vincent Jeanselme,
Artur Dubrawski,
Alexandra Chouldechova
Abstract:
Machine learning (ML) is increasingly being used to support high-stakes decisions. However, there is frequently a construct gap: a gap between the construct of interest to the decision-making task and what is captured in proxies used as labels to train ML models. As a result, ML models may fail to capture important dimensions of decision criteria, hampering their utility for decision support. Thus…
▽ More
Machine learning (ML) is increasingly being used to support high-stakes decisions. However, there is frequently a construct gap: a gap between the construct of interest to the decision-making task and what is captured in proxies used as labels to train ML models. As a result, ML models may fail to capture important dimensions of decision criteria, hampering their utility for decision support. Thus, an essential step in the design of ML systems for decision support is selecting a target label among available proxies. In this work, we explore the use of historical expert decisions as a rich -- yet also imperfect -- source of information that can be combined with observed outcomes to narrow the construct gap. We argue that managers and system designers may be interested in learning from experts in instances where they exhibit consistency with each other, while learning from observed outcomes otherwise. We develop a methodology to enable this goal using information that is commonly available in organizational information systems. This involves two core steps. First, we propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert. Second, we introduce a label amalgamation approach that allows ML models to simultaneously learn from expert decisions and observed outcomes. Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap, yielding better predictive performance than learning from either observed outcomes or expert decisions alone.
△ Less
Submitted 3 June, 2024; v1 submitted 24 January, 2021;
originally announced January 2021.
-
A Case for Humans-in-the-Loop: Decisions in the Presence of Erroneous Algorithmic Scores
Authors:
Maria De-Arteaga,
Riccardo Fogliato,
Alexandra Chouldechova
Abstract:
The increased use of algorithmic predictions in sensitive domains has been accompanied by both enthusiasm and concern. To understand the opportunities and risks of these technologies, it is key to study how experts alter their decisions when using such tools. In this paper, we study the adoption of an algorithmic tool used to assist child maltreatment hotline screening decisions. We focus on the q…
▽ More
The increased use of algorithmic predictions in sensitive domains has been accompanied by both enthusiasm and concern. To understand the opportunities and risks of these technologies, it is key to study how experts alter their decisions when using such tools. In this paper, we study the adoption of an algorithmic tool used to assist child maltreatment hotline screening decisions. We focus on the question: Are humans capable of identifying cases in which the machine is wrong, and of overriding those recommendations? We first show that humans do alter their behavior when the tool is deployed. Then, we show that humans are less likely to adhere to the machine's recommendation when the score displayed is an incorrect estimate of risk, even when overriding the recommendation requires supervisory approval. These results highlight the risks of full automation and the importance of designing decision pipelines that provide humans with autonomy.
△ Less
Submitted 20 February, 2020; v1 submitted 19 February, 2020;
originally announced February 2020.
-
Proceedings of NeurIPS 2019 Workshop on Machine Learning for the Developing World: Challenges and Risks of ML4D
Authors:
Maria De-Arteaga,
Tejumade Afonja,
Amanda Coston
Abstract:
This is the proceedings of the 3rd ML4D workshop which was help in Vancouver, Canada on December 13, 2019 as part of the Neural Information Processing Systems conference.
This is the proceedings of the 3rd ML4D workshop which was help in Vancouver, Canada on December 13, 2019 as part of the Neural Information Processing Systems conference.
△ Less
Submitted 10 April, 2020; v1 submitted 1 January, 2020;
originally announced January 2020.
-
Killings of social leaders in the Colombian post-conflict: Data analysis for investigative journalism
Authors:
Maria De-Arteaga,
Benedikt Boecking
Abstract:
After the peace agreement of 2016 with FARC, the killings of social leaders have emerged as an important post-conflict challenge for Colombia. We present a data analysis based on official records obtained from the Colombian General Attorney's Office spanning the time period from 2012 to 2017. The results of the analysis show a drastic increase in the officially recorded number of killings of democ…
▽ More
After the peace agreement of 2016 with FARC, the killings of social leaders have emerged as an important post-conflict challenge for Colombia. We present a data analysis based on official records obtained from the Colombian General Attorney's Office spanning the time period from 2012 to 2017. The results of the analysis show a drastic increase in the officially recorded number of killings of democratically elected leaders of community organizations, in particular those belonging to Juntas de Acción Comunal [Community Action Boards]. These are important entities that have been part of the Colombian democratic apparatus since 1958, and enable communities to advocate for their needs. We also describe how the data analysis guided a journalistic investigation that was motivated by the Colombian government's denial of the systematic nature of social leaders killings.
△ Less
Submitted 19 June, 2019;
originally announced June 2019.
-
What's in a Name? Reducing Bias in Bios without Access to Protected Attributes
Authors:
Alexey Romanov,
Maria De-Arteaga,
Hanna Wallach,
Jennifer Chayes,
Christian Borgs,
Alexandra Chouldechova,
Sahin Geyik,
Krishnaram Kenthapadi,
Anna Rumshisky,
Adam Tauman Kalai
Abstract:
There is a growing body of work that proposes methods for mitigating bias in machine learning systems. These methods typically rely on access to protected attributes such as race, gender, or age. However, this raises two significant challenges: (1) protected attributes may not be available or it may not be legal to use them, and (2) it is often desirable to simultaneously consider multiple protect…
▽ More
There is a growing body of work that proposes methods for mitigating bias in machine learning systems. These methods typically rely on access to protected attributes such as race, gender, or age. However, this raises two significant challenges: (1) protected attributes may not be available or it may not be legal to use them, and (2) it is often desirable to simultaneously consider multiple protected attributes, as well as their intersections. In the context of mitigating bias in occupation classification, we propose a method for discouraging correlation between the predicted probability of an individual's true occupation and a word embedding of their name. This method leverages the societal biases that are encoded in word embeddings, eliminating the need for access to protected attributes. Crucially, it only requires access to individuals' names at training time and not at deployment time. We evaluate two variations of our proposed method using a large-scale dataset of online biographies. We find that both variations simultaneously reduce race and gender biases, with almost no reduction in the classifier's overall true positive rate.
△ Less
Submitted 10 April, 2019;
originally announced April 2019.
-
Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting
Authors:
Maria De-Arteaga,
Alexey Romanov,
Hanna Wallach,
Jennifer Chayes,
Christian Borgs,
Alexandra Chouldechova,
Sahin Geyik,
Krishnaram Kenthapadi,
Adam Tauman Kalai
Abstract:
We present a large-scale study of gender bias in occupation classification, a task where the use of machine learning may lead to negative outcomes on peoples' lives. We analyze the potential allocation harms that can result from semantic representation bias. To do so, we study the impact on occupation classification of including explicit gender indicators---such as first names and pronouns---in di…
▽ More
We present a large-scale study of gender bias in occupation classification, a task where the use of machine learning may lead to negative outcomes on peoples' lives. We analyze the potential allocation harms that can result from semantic representation bias. To do so, we study the impact on occupation classification of including explicit gender indicators---such as first names and pronouns---in different semantic representations of online biographies. Additionally, we quantify the bias that remains when these indicators are "scrubbed," and describe proxy behavior that occurs in the absence of explicit gender indicators. As we demonstrate, differences in true positive rates between genders are correlated with existing gender imbalances in occupations, which may compound these imbalances.
△ Less
Submitted 27 January, 2019;
originally announced January 2019.
-
Proceedings of NeurIPS 2018 Workshop on Machine Learning for the Developing World: Achieving Sustainable Impact
Authors:
Maria De-Arteaga,
Amanda Coston,
William Herlands
Abstract:
This is the Proceedings of NeurIPS 2018 Workshop on Machine Learning for the Developing World: Achieving Sustainable Impact, held in Montreal, Canada on December 8, 2018
This is the Proceedings of NeurIPS 2018 Workshop on Machine Learning for the Developing World: Achieving Sustainable Impact, held in Montreal, Canada on December 8, 2018
△ Less
Submitted 18 February, 2019; v1 submitted 20 December, 2018;
originally announced December 2018.
-
What are the biases in my word embedding?
Authors:
Nathaniel Swinger,
Maria De-Arteaga,
Neil Thomas Heffernan IV,
Mark DM Leiserson,
Adam Tauman Kalai
Abstract:
This paper presents an algorithm for enumerating biases in word embeddings. The algorithm exposes a large number of offensive associations related to sensitive features such as race and gender on publicly available embeddings, including a supposedly "debiased" embedding. These biases are concerning in light of the widespread use of word embeddings. The associations are identified by geometric patt…
▽ More
This paper presents an algorithm for enumerating biases in word embeddings. The algorithm exposes a large number of offensive associations related to sensitive features such as race and gender on publicly available embeddings, including a supposedly "debiased" embedding. These biases are concerning in light of the widespread use of word embeddings. The associations are identified by geometric patterns in word embeddings that run parallel between people's names and common lower-case tokens. The algorithm is highly unsupervised: it does not even require the sensitive features to be pre-specified. This is desirable because: (a) many forms of discrimination--such as racial discrimination--are linked to social constructs that may vary depending on the context, rather than to categories with fixed definitions; and (b) it makes it easier to identify biases against intersectional groups, which depend on combinations of sensitive features. The inputs to our algorithm are a list of target tokens, e.g. names, and a word embedding. It outputs a number of Word Embedding Association Tests (WEATs) that capture various biases present in the data. We illustrate the utility of our approach on publicly available word embeddings and lists of names, and evaluate its output using crowdsourcing. We also show how removing names may not remove potential proxy bias.
△ Less
Submitted 19 June, 2019; v1 submitted 20 December, 2018;
originally announced December 2018.
-
Learning under selective labels in the presence of expert consistency
Authors:
Maria De-Arteaga,
Artur Dubrawski,
Alexandra Chouldechova
Abstract:
We explore the problem of learning under selective labels in the context of algorithm-assisted decision making. Selective labels is a pervasive selection bias problem that arises when historical decision making blinds us to the true outcome for certain instances. Examples of this are common in many applications, ranging from predicting recidivism using pre-trial release data to diagnosing patients…
▽ More
We explore the problem of learning under selective labels in the context of algorithm-assisted decision making. Selective labels is a pervasive selection bias problem that arises when historical decision making blinds us to the true outcome for certain instances. Examples of this are common in many applications, ranging from predicting recidivism using pre-trial release data to diagnosing patients. In this paper we discuss why selective labels often cannot be effectively tackled by standard methods for adjusting for sample selection bias, even if there are no unobservables. We propose a data augmentation approach that can be used to either leverage expert consistency to mitigate the partial blindness that results from selective labels, or to empirically validate whether learning under such framework may lead to unreliable models prone to systemic discrimination.
△ Less
Submitted 4 July, 2018; v1 submitted 2 July, 2018;
originally announced July 2018.
-
Proceedings of NIPS 2017 Workshop on Machine Learning for the Developing World
Authors:
Maria De-Arteaga,
William Herlands
Abstract:
This is the Proceedings of NIPS 2017 Workshop on Machine Learning for the Developing World, held in Long Beach, California, USA on December 8, 2017
This is the Proceedings of NIPS 2017 Workshop on Machine Learning for the Developing World, held in Long Beach, California, USA on December 8, 2017
△ Less
Submitted 12 December, 2017; v1 submitted 26 November, 2017;
originally announced November 2017.
-
Discovery of Complex Anomalous Patterns of Sexual Violence in El Salvador
Authors:
Maria De-Arteaga,
Artur Dubrawski
Abstract:
When sexual violence is a product of organized crime or social imaginary, the links between sexual violence episodes can be understood as a latent structure. With this assumption in place, we can use data science to uncover complex patterns. In this paper we focus on the use of data mining techniques to unveil complex anomalous spatiotemporal patterns of sexual violence. We illustrate their use by…
▽ More
When sexual violence is a product of organized crime or social imaginary, the links between sexual violence episodes can be understood as a latent structure. With this assumption in place, we can use data science to uncover complex patterns. In this paper we focus on the use of data mining techniques to unveil complex anomalous spatiotemporal patterns of sexual violence. We illustrate their use by analyzing all reported rapes in El Salvador over a period of nine years. Through our analysis, we are able to provide evidence of phenomena that, to the best of our knowledge, have not been previously reported in literature. We devote special attention to a pattern we discover in the East, where underage victims report their boyfriends as perpetrators at anomalously high rates. Finally, we explain how such analyzes could be conducted in real-time, enabling early detection of emerging patterns to allow law enforcement agencies and policy makers to react accordingly.
△ Less
Submitted 17 November, 2017;
originally announced November 2017.
-
Canonical Autocorrelation Analysis
Authors:
Maria De-Arteaga,
Artur Dubrawski,
Peter Huggins
Abstract:
We present an extension of sparse Canonical Correlation Analysis (CCA) designed for finding multiple-to-multiple linear correlations within a single set of variables. Unlike CCA, which finds correlations between two sets of data where the rows are matched exactly but the columns represent separate sets of variables, the method proposed here, Canonical Autocorrelation Analysis (CAA), finds multivar…
▽ More
We present an extension of sparse Canonical Correlation Analysis (CCA) designed for finding multiple-to-multiple linear correlations within a single set of variables. Unlike CCA, which finds correlations between two sets of data where the rows are matched exactly but the columns represent separate sets of variables, the method proposed here, Canonical Autocorrelation Analysis (CAA), finds multivariate correlations within just one set of variables. This can be useful when we look for hidden parsimonious structures in data, each involving only a small subset of all features. In addition, the discovered correlations are highly interpretable as they are formed by pairs of sparse linear combinations of the original features. We show how CAA can be of use as a tool for anomaly detection when the expected structure of correlations is not followed by anomalous data. We illustrate the utility of CAA in two application domains where single-class and unsupervised learning of correlation structures are particularly relevant: breast cancer diagnosis and radiation threat detection. When applied to the Wisconsin Breast Cancer data, single-class CAA is competitive with supervised methods used in literature. On the radiation threat detection task, unsupervised CAA performs significantly better than an unsupervised alternative prevalent in the domain, while providing valuable additional insights for threat analysis.
△ Less
Submitted 19 November, 2015;
originally announced November 2015.
-
Lass-0: sparse non-convex regression by local search
Authors:
William Herlands,
Maria De-Arteaga,
Daniel Neill,
Artur Dubrawski
Abstract:
We compute approximate solutions to L0 regularized linear regression using L1 regularization, also known as the Lasso, as an initialization step. Our algorithm, the Lass-0 ("Lass-zero"), uses a computationally efficient stepwise search to determine a locally optimal L0 solution given any L1 regularization solution. We present theoretical results of consistency under orthogonality and appropriate h…
▽ More
We compute approximate solutions to L0 regularized linear regression using L1 regularization, also known as the Lasso, as an initialization step. Our algorithm, the Lass-0 ("Lass-zero"), uses a computationally efficient stepwise search to determine a locally optimal L0 solution given any L1 regularization solution. We present theoretical results of consistency under orthogonality and appropriate handling of redundant features. Empirically, we use synthetic data to demonstrate that Lass-0 solutions are closer to the true sparse support than L1 regularization models. Additionally, in real-world data Lass-0 finds more parsimonious solutions than L1 regularization while maintaining similar predictive accuracy.
△ Less
Submitted 17 February, 2016; v1 submitted 13 November, 2015;
originally announced November 2015.