-
Understanding Gender Bias in AI-Generated Product Descriptions
Authors:
Markelle Kelly,
Mohammad Tahaei,
Padhraic Smyth,
Lauren Wilcox
Abstract:
While gender bias in large language models (LLMs) has been extensively studied in many domains, uses of LLMs in e-commerce remain largely unexamined and may reveal novel forms of algorithmic bias and harm. Our work investigates this space, developing data-driven taxonomic categories of gender bias in the context of product description generation, which we situate with respect to existing general purpose harms taxonomies. We illustrate how AI-generated product descriptions can uniquely surface gender biases in ways that require specialized detection and mitigation approaches. Further, we quantitatively analyze issues corresponding to our taxonomic categories in two models used for this task -- GPT-3.5 and an e-commerce-specific LLM -- demonstrating that these forms of bias commonly occur in practice. Our results illuminate unique, under-explored dimensions of gender bias, such as assumptions about clothing size, stereotypical bias in which features of a product are advertised, and differences in the use of persuasive language. These insights contribute to our understanding of three types of AI harms identified by current frameworks: exclusionary norms, stereotyping, and performance disparities, particularly for the context of e-commerce.
Submitted 3 June, 2025;
originally announced June 2025.
-
The Problems with Proxies: Making Data Work Visible through Requester Practices
Authors:
Annabel Rothschild,
Ding Wang,
Niveditha Jayakumar Vilvanathan,
Lauren Wilcox,
Carl DiSalvo,
Betsy DiSalvo
Abstract:
Fairness in AI and ML systems is increasingly linked to the proper treatment and recognition of data workers involved in training dataset development. Yet, those who collect and annotate the data, and thus have the most intimate knowledge of its development, are often excluded from critical discussions. This exclusion prevents data annotators, who are domain experts, from contributing effectively to dataset contextualization. Our investigation into the hiring and engagement practices of 52 data work requesters on platforms like Amazon Mechanical Turk reveals a gap: requesters frequently hold naive or unchallenged notions of worker identities and capabilities and rely on ad-hoc qualification tasks that fail to respect the workers' expertise. These practices not only undermine the quality of data but also the ethical standards of AI development. To rectify these issues, we advocate for policy changes to enhance how data annotation tasks are designed and managed and to ensure data workers are treated with the respect they deserve.
Submitted 21 August, 2024;
originally announced August 2024.
-
Surveys Considered Harmful? Reflecting on the Use of Surveys in AI Research, Development, and Governance
Authors:
Mohammad Tahaei,
Daricia Wilkinson,
Alisa Frik,
Michael Muller,
Ruba Abu-Salma,
Lauren Wilcox
Abstract:
Calls for engagement with the public in Artificial Intelligence (AI) research, development, and governance are increasing, leading to the use of surveys to capture people's values, perceptions, and experiences related to AI. In this paper, we critically examine the state of human participant surveys associated with these topics. Through both a reflexive analysis of a survey pilot spanning six countries and a systematic literature review of 44 papers featuring public surveys related to AI, we explore prominent perspectives and methodological nuances associated with surveys to date. We find that public surveys on AI topics are vulnerable to specific Western knowledge, values, and assumptions in their design, including in their positioning of ethical concepts and societal values, lack sufficient critical discourse surrounding deployment strategies, and demonstrate inconsistent forms of transparency in their reporting. Based on our findings, we distill provocations and heuristic questions for our community, to recognize the limitations of surveys for meeting the goals of engagement, and to cultivate shared principles to design, deploy, and interpret surveys cautiously and responsibly.
Submitted 26 July, 2024;
originally announced August 2024.
-
Implications of Regulations on the Use of AI and Generative AI for Human-Centered Responsible Artificial Intelligence
Authors:
Marios Constantinides,
Mohammad Tahaei,
Daniele Quercia,
Simone Stumpf,
Michael Madaio,
Sean Kennedy,
Lauren Wilcox,
Jessica Vitak,
Henriette Cramer,
Edyta Bogucka,
Ricardo Baeza-Yates,
Ewa Luger,
Jess Holbrook,
Michael Muller,
Ilana Golbin Blumenfeld,
Giada Pistilli
Abstract:
With the upcoming AI regulations (e.g., EU AI Act) and rapid advancements in generative AI, new challenges emerge in the area of Human-Centered Responsible Artificial Intelligence (HCR-AI). As AI becomes more ubiquitous, questions around decision-making authority, human oversight, accountability, sustainability, and the ethical and legal responsibilities of AI and their creators become paramount. Addressing these questions requires a collaborative approach. By involving stakeholders from various disciplines in the 2nd edition of the HCR-AI Special Interest Group (SIG) at CHI 2024, we aim to discuss the implications of regulations in HCI research, develop new theories, evaluation frameworks, and methods to navigate the complex nature of AI ethics, steering AI development in a direction that is beneficial and sustainable for all of humanity.
Submitted 29 February, 2024;
originally announced March 2024.
-
Farsight: Fostering Responsible AI Awareness During AI Application Prototyping
Authors:
Zijie J. Wang,
Chinmay Kulkarni,
Lauren Wilcox,
Michael Terry,
Michael Madaio
Abstract:
Prompt-based interfaces for Large Language Models (LLMs) have made prototyping and building AI-powered applications easier than ever before. However, identifying potential harms that may arise from AI applications remains a challenge, particularly during prompt-based prototyping. To address this, we present Farsight, a novel in situ interactive tool that helps people identify potential harms from the AI applications they are prototyping. Based on a user's prompt, Farsight highlights news articles about relevant AI incidents and allows users to explore and edit LLM-generated use cases, stakeholders, and harms. We report design insights from a co-design study with 10 AI prototypers and findings from a user study with 42 AI prototypers. After using Farsight, AI prototypers in our user study are better able to independently identify potential harms associated with a prompt and find our tool more useful and usable than existing resources. Their qualitative feedback also highlights that Farsight encourages them to focus on end-users and think beyond immediate harms. We discuss these findings and reflect on their implications for designing AI prototyping experiences that meaningfully engage with AI harms. Farsight is publicly accessible at: https://PAIR-code.github.io/farsight.
Submitted 2 July, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
How Knowledge Workers Think Generative AI Will (Not) Transform Their Industries
Authors:
Allison Woodruff,
Renee Shelby,
Patrick Gage Kelley,
Steven Rousso-Schindler,
Jamila Smith-Loud,
Lauren Wilcox
Abstract:
Generative AI is expected to have transformative effects in multiple knowledge industries. To better understand how knowledge workers expect generative AI may affect their industries in the future, we conducted participatory research workshops for seven different industries, with a total of 54 participants across three US cities. We describe participants' expectations of generative AI's impact, including a dominant narrative that cut across the groups' discourse: participants largely envision generative AI as a tool to perform menial work, under human review. Participants do not generally anticipate the disruptive changes to knowledge industries currently projected in common media and academic narratives. Participants do however envision generative AI may amplify four social forces currently shaping their industries: deskilling, dehumanization, disconnection, and disinformation. We describe these forces, and then we provide additional detail regarding attitudes in specific knowledge industries. We conclude with a discussion of implications and research challenges for the HCI community.
Submitted 20 March, 2024; v1 submitted 10 October, 2023;
originally announced October 2023.
-
A Systematic Review and Thematic Analysis of Community-Collaborative Approaches to Computing Research
Authors:
Ned Cooper,
Tiffanie Horne,
Gillian Hayes,
Courtney Heldreth,
Michal Lahav,
Jess Scon Holbrook,
Lauren Wilcox
Abstract:
HCI researchers have been gradually shifting attention from individual users to communities when engaging in research, design, and system development. However, our field has yet to establish a cohesive, systematic understanding of the challenges, benefits, and commitments of community-collaborative approaches to research. We conducted a systematic review and thematic analysis of 47 computing research papers discussing participatory research with communities for the development of technological artifacts and systems, published over the last two decades. From this review, we identified seven themes associated with the evolution of a project: from establishing community partnerships to sustaining results. Our findings suggest that several tensions characterize these projects, many of which relate to the power and position of researchers, and the computing research environment, relative to community partners. We discuss the implications of our findings and offer methodological proposals to guide HCI, and computing research more broadly, towards practices that center communities.
Submitted 8 July, 2022;
originally announced July 2022.
-
Healthsheet: Development of a Transparency Artifact for Health Datasets
Authors:
Negar Rostamzadeh,
Diana Mincu,
Subhrajit Roy,
Andrew Smart,
Lauren Wilcox,
Mahima Pushkarna,
Jessica Schrouff,
Razvan Amironesei,
Nyalleng Moorosi,
Katherine Heller
Abstract:
Machine learning (ML) approaches have demonstrated promising results in a wide range of healthcare applications. Data plays a crucial role in developing ML-based healthcare systems that directly affect people's lives. Many of the ethical issues surrounding the use of ML in healthcare stem from structural inequalities underlying the way we collect, use, and handle data. Developing guidelines to improve documentation practices regarding the creation, use, and maintenance of ML healthcare datasets is therefore of critical importance. In this work, we introduce Healthsheet, a contextualized adaptation of the original datasheet questionnaire (Gebru et al., 2018) for health-specific applications. Through a series of semi-structured interviews, we adapt the datasheets for healthcare data documentation. As part of the Healthsheet development process and to understand the obstacles researchers face in creating datasheets, we worked with three publicly available healthcare datasets as our case studies, each with different types of structured data: electronic health records (EHR), clinical trial study data, and smartphone-based performance outcome measures. Our findings from the interviewee study and case studies show 1) that datasheets should be contextualized for healthcare, 2) that despite incentives to adopt accountability practices such as datasheets, there is a lack of consistency in the broader use of these practices, 3) how the ML for health community views datasheets, and particularly Healthsheets, as a diagnostic tool to surface the limitations and strengths of datasets, and 4) the relative importance of different fields in the datasheet to healthcare concerns.
Submitted 25 February, 2022;
originally announced February 2022.
-
It's Time to Do Something: Mitigating the Negative Impacts of Computing Through a Change to the Peer Review Process
Authors:
Brent Hecht,
Lauren Wilcox,
Jeffrey P. Bigham,
Johannes Schöning,
Ehsan Hoque,
Jason Ernst,
Yonatan Bisk,
Luigi De Russis,
Lana Yarosh,
Bushra Anjum,
Danish Contractor,
Cathy Wu
Abstract:
The computing research community needs to work much harder to address the downsides of our innovations. Between the erosion of privacy, threats to democracy, and automation's effect on employment (among many other issues), we can no longer simply assume that our research will have a net positive impact on the world. While bending the arc of computing innovation towards societal benefit may at first seem intractable, we believe we can achieve substantial progress with a straightforward step: making a small change to the peer review process. As we explain below, we hypothesize that our recommended change will force computing researchers to more deeply consider the negative impacts of their work. We also expect that this change will incentivize research and policy that alleviates computing's negative impacts.
Submitted 17 December, 2021;
originally announced December 2021.
-
Modernizing Data Control: Making Personal Digital Data Mutually Beneficial for Citizens and Industry
Authors:
Sujata Banerjee,
Yiling Chen,
Kobbi Nissim,
David Parkes,
Katie Siek,
Lauren Wilcox
Abstract:
We are entering a new "data everywhere-anytime" era that pivots us from being tracked online to continuous tracking as we move through our everyday lives. We have smart devices in our homes, on our bodies, and around our communities that collect data that is used to guide decisions that have a major impact on our lives - from loans to job interviews and judicial rulings to health care interventions. We create a lot of data, but who owns that data? How is it shared? How will it be used? While the average person does not have a good understanding of how the data is being used, they know that it carries risks for them and society.
Although some people may believe they own their data, in reality, the problem of understanding the myriad ways in which data is collected, shared, and used, and the consequences of these uses is so complex that only a few people want to manage their data themselves. Furthermore, much of the value in the data cannot be extracted by individuals alone, as it lies in the connections and insights garnered from (1) one's own personal data (is your fitness improving? Is your home more energy efficient than the average home of this size?) and (2) one's relationship with larger groups (demographic group voting blocks; friend network influence on purchasing). But sometimes these insights have unintended consequences for the person generating the data, especially in terms of loss of privacy, unfairness, inappropriate inferences, information bias, manipulation, and discrimination. There are also societal impacts, such as effects on speech freedoms, political manipulation, and amplified harms to weakened and underrepresented communities. To this end, we look at major questions that policymakers should ask and things to consider when addressing these data ownership concerns.
Submitted 15 December, 2020;
originally announced December 2020.
-
Pain Intensity Estimation from Mobile Video Using 2D and 3D Facial Keypoints
Authors:
Matthew Lee,
Lyndon Kennedy,
Andreas Girgensohn,
Lynn Wilcox,
John Song En Lee,
Chin Wen Tan,
Ban Leong Sng
Abstract:
Managing post-surgical pain is critical for successful surgical outcomes. One of the challenges of pain management is accurately assessing the pain level of patients. Self-reported numeric pain ratings are limited because they are subjective, can be affected by mood, and can influence the patient's perception of pain when making comparisons. In this paper, we introduce an approach that analyzes 2D and 3D facial keypoints of post-surgical patients to estimate their pain intensity level. Our approach leverages the previously unexplored capabilities of a smartphone to capture a dense 3D representation of a person's face as input for pain intensity level estimation. Our contributions are a data collection study with post-surgical patients to collect ground-truth labeled sequences of 2D and 3D facial keypoints for developing a pain estimation algorithm, a pain estimation model that uses multiple instance learning to overcome inherent limitations in facial keypoint sequences, and the preliminary results of the pain estimation model using 2D and 3D features with comparisons of alternate approaches.
Submitted 16 June, 2020;
originally announced June 2020.
-
Fast Mesh Refinement in Pseudospectral Optimal Control
Authors:
N. Koeppen,
I. M. Ross,
L. C. Wilcox,
R. J. Proulx
Abstract:
Mesh refinement in pseudospectral (PS) optimal control is embarrassingly easy: simply increase the order $N$ of the Lagrange interpolating polynomial and the mathematics of convergence automates the distribution of the grid points. Unfortunately, as $N$ increases, the condition number of the resulting linear algebra increases as $N^2$; hence, spectral efficiency and accuracy are lost in practice. In this paper, we advance Birkhoff interpolation concepts over an arbitrary grid to generate well-conditioned PS optimal control discretizations. We show that the condition number increases only as $\sqrt{N}$ in general, but is independent of $N$ for the special case of one of the boundary points being fixed. Hence, spectral accuracy and efficiency are maintained as $N$ increases. The effectiveness of the resulting fast mesh refinement strategy is demonstrated by using polynomials of over a thousandth order to solve a low-thrust, long-duration orbit transfer problem.
Submitted 29 April, 2019;
originally announced April 2019.
-
Array Program Transformation with Loo.py by Example: High-Order Finite Elements
Authors:
Andreas Klöckner,
Lucas C. Wilcox,
T. Warburton
Abstract:
To concisely and effectively demonstrate the capabilities of our program transformation system Loo.py, we examine a transformation path from two real-world Fortran subroutines as found in a weather model to a single high-performance computational kernel suitable for execution on modern GPU hardware. Along the transformation path, we encounter kernel fusion, vectorization, prefetching, parallelization, and algorithmic changes achieved by mechanized conversion between imperative and functional/substitution-based code, among others. We conclude with performance results that demonstrate the effects and support the effectiveness of the applied transformations.
Submitted 13 April, 2016;
originally announced April 2016.
-
Strong Scaling for Numerical Weather Prediction at Petascale with the Atmospheric Model NUMA
Authors:
Andreas Müller,
Michal A. Kopera,
Simone Marras,
Lucas C. Wilcox,
Tobin Isaac,
Francis X. Giraldo
Abstract:
Numerical weather prediction (NWP) has proven to be computationally challenging due to its inherent multiscale nature. Currently, the highest resolution NWP models use a horizontal resolution of about 10 km. In order to increase the resolution of NWP models, highly scalable atmospheric models are needed.
The Non-hydrostatic Unified Model of the Atmosphere (NUMA), developed by the authors at the Naval Postgraduate School, was designed to achieve this purpose. NUMA is used by the Naval Research Laboratory, Monterey as the engine inside its next generation weather prediction system NEPTUNE. NUMA solves the fully compressible Navier-Stokes equations by means of high-order Galerkin methods (both spectral element as well as discontinuous Galerkin methods can be used). Mesh generation is done using the p4est library. NUMA is capable of running middle and upper atmosphere simulations since it does not make use of the shallow-atmosphere approximation.
This paper presents the performance analysis and optimization of the spectral element version of NUMA. The performance at different optimization stages is analyzed using a theoretical performance model as well as measurements via hardware counters. Machine-independent optimization is compared to machine-specific optimization using BG/Q vector intrinsics. By using vector intrinsics, the main computations reach 1.2 PFlops on the entire machine Mira (12% of the theoretical peak performance). The paper also presents scalability studies for two idealized test cases that are relevant for NWP applications. The atmospheric model NUMA delivers an excellent strong scaling efficiency of 99% on the entire supercomputer Mira using a mesh with 1.8 billion grid points. This makes it possible to run a global forecast of a baroclinic wave test case at 3 km uniform horizontal resolution and double precision within the time frame required for operational weather prediction.
Submitted 8 September, 2016; v1 submitted 4 November, 2015;
originally announced November 2015.
-
Recursive Algorithms for Distributed Forests of Octrees
Authors:
Tobin Isaac,
Carsten Burstedde,
Lucas C. Wilcox,
Omar Ghattas
Abstract:
The forest-of-octrees approach to parallel adaptive mesh refinement and coarsening (AMR) has recently been demonstrated in the context of a number of large-scale PDE-based applications. Although linear octrees, which store only leaf octants, have an underlying tree structure by definition, it is not often exploited in previously published mesh-related algorithms. This is because the branches are not explicitly stored, and because the topological relationships in meshes, such as the adjacency between cells, introduce dependencies that do not respect the octree hierarchy. In this work we combine hierarchical and topological relationships between octree branches to design efficient recursive algorithms.
We present three important algorithms with recursive implementations. The first is a parallel search for leaves matching any of a set of multiple search criteria. The second is a ghost layer construction algorithm that handles arbitrarily refined octrees that are not covered by previous algorithms, which require a 2:1 condition between neighboring leaves. The third is a universal mesh topology iterator. This iterator visits every cell in a domain partition, as well as every interface (face, edge and corner) between these cells. The iterator calculates the local topological information for every interface that it visits, taking into account the nonconforming interfaces that increase the complexity of describing the local topology. To demonstrate the utility of the topology iterator, we use it to compute the numbering and encoding of higher-order $C^0$ nodal basis functions.
We analyze the complexity of the new recursive algorithms theoretically, and assess their performance, both in terms of single-processor efficiency and in terms of parallel scalability, demonstrating good weak and strong scaling up to 458k cores of the JUQUEEN supercomputer.
Submitted 19 August, 2015; v1 submitted 31 May, 2014;
originally announced June 2014.