Bias Delayed is Bias Denied? Assessing the Effect of Reporting Delays on Disparity Assessments
Authors:
Jennah Gosciak,
Aparna Balagopalan,
Derek Ouyang,
Allison Koenecke,
Marzyeh Ghassemi,
Daniel E. Ho
Abstract:
Conducting disparity assessments at regular time intervals is critical for surfacing potential biases in decision-making and improving outcomes across demographic groups. Because disparity assessments fundamentally depend on the availability of demographic information, their efficacy is limited by the availability and consistency of available demographic identifiers. While prior work has considere…
▽ More
Conducting disparity assessments at regular time intervals is critical for surfacing potential biases in decision-making and improving outcomes across demographic groups. Because disparity assessments fundamentally depend on the availability of demographic information, their efficacy is limited by the availability and consistency of available demographic identifiers. While prior work has considered the impact of missing data on fairness, little attention has been paid to the role of delayed demographic data. Delayed data, while eventually observed, might be missing at the critical point of monitoring and action -- and delays may be unequally distributed across groups in ways that distort disparity assessments. We characterize such impacts in healthcare, using electronic health records of over 5M patients across primary care practices in all 50 states. Our contributions are threefold. First, we document the high rate of race and ethnicity reporting delays in a healthcare setting and demonstrate widespread variation in rates at which demographics are reported across different groups. Second, through a set of retrospective analyses using real data, we find that such delays impact disparity assessments and hence conclusions made across a range of consequential healthcare outcomes, particularly at more granular levels of state-level and practice-level assessments. Third, we find limited ability of conventional methods that impute missing race in mitigating the effects of reporting delays on the accuracy of timely disparity assessments. Our insights and methods generalize to many domains of algorithmic fairness where delays in the availability of sensitive information may confound audits, thus deserving closer attention within a pipeline-aware machine learning framework.
△ Less
Submitted 16 June, 2025;
originally announced June 2025.
AI Rules? Characterizing Reddit Community Policies Towards AI-Generated Content
Authors:
Travis Lloyd,
Jennah Gosciak,
Tung Nguyen,
Mor Naaman
Abstract:
How are Reddit communities responding to AI-generated content? We explored this question through a large-scale analysis of subreddit community rules and their change over time. We collected the metadata and community rules for over $300,000$ public subreddits and measured the prevalence of rules governing AI. We labeled subreddits and AI rules according to existing taxonomies from the HCI literatu…
▽ More
How are Reddit communities responding to AI-generated content? We explored this question through a large-scale analysis of subreddit community rules and their change over time. We collected the metadata and community rules for over $300,000$ public subreddits and measured the prevalence of rules governing AI. We labeled subreddits and AI rules according to existing taxonomies from the HCI literature and a new taxonomy we developed specific to AI rules. While rules about AI are still relatively uncommon, the number of subreddits with these rules more than doubled over the course of a year. AI rules are more common in larger subreddits and communities focused on art or celebrity topics, and less common in those focused on social support. These rules often focus on AI images and evoke, as justification, concerns about quality and authenticity. Overall, our findings illustrate the emergence of varied concerns about AI, in different community contexts. Platform designers and HCI researchers should heed these concerns if they hope to encourage community self-determination in the age of generative AI. We make our datasets public to enable future large-scale studies of community self-governance.
△ Less
Submitted 16 March, 2025; v1 submitted 15 October, 2024;
originally announced October 2024.