-
Can we Debias Social Stereotypes in AI-Generated Images? Examining Text-to-Image Outputs and User Perceptions
Authors:
Saharsh Barve,
Andy Mao,
Jiayue Melissa Shi,
Prerna Juneja,
Koustuv Saha
Abstract:
Recent advances in generative AI have enabled visual content creation through text-to-image (T2I) generation. However, despite their creative potential, T2I models often replicate and amplify societal stereotypes -- particularly those related to gender, race, and culture -- raising important ethical concerns. This paper proposes a theory-driven bias detection rubric and a Social Stereotype Index (SSI) to systematically evaluate social biases in T2I outputs. We audited three major T2I model outputs -- DALL-E-3, Midjourney-6.1, and Stability AI Core -- using 100 queries across three categories -- geocultural, occupational, and adjectival. Our analysis reveals that initial outputs are prone to include stereotypical visual cues, including gendered professions, cultural markers, and Western beauty norms. To address this, we adopted our rubric to conduct targeted prompt refinement using LLMs, which significantly reduced bias -- SSI dropped by 61% for geocultural, 69% for occupational, and 51% for adjectival queries. We complemented our quantitative analysis through a user study examining perceptions, awareness, and preferences around AI-generated biased imagery. Our findings reveal a key tension -- although prompt refinement can mitigate stereotypes, it can limit contextual alignment. Interestingly, users often perceived stereotypical images to be more aligned with their expectations. We discuss the need to balance ethical debiasing with contextual relevance and call for T2I systems that support global diversity and inclusivity while not compromising the reflection of real-world social complexity.
Submitted 27 May, 2025;
originally announced May 2025.
-
Algorithmic Behaviors Across Regions: A Geolocation Audit of YouTube Search for COVID-19 Misinformation Between the United States and South Africa
Authors:
Hayoung Jung,
Prerna Juneja,
Tanushree Mitra
Abstract:
Despite being an integral tool for finding health-related information online, YouTube has faced criticism for disseminating COVID-19 misinformation globally to its users. Yet, prior audit studies have predominantly investigated YouTube within Global North contexts, often overlooking the Global South. To address this gap, we conducted a comprehensive 10-day geolocation-based audit on YouTube to compare the prevalence of COVID-19 misinformation in search results between the United States (US) and South Africa (SA), countries heavily affected by the pandemic in the Global North and the Global South, respectively. For each country, we selected 3 geolocations and placed sock-puppets, or bots emulating "real" users, that collected search results for 48 search queries sorted by 4 search filters for 10 days, yielding a dataset of 915K results. We found that 31.55% of the top-10 search results contained COVID-19 misinformation. Among the top-10 search results, bots in SA faced significantly more misinformative search results than their US counterparts. Overall, our study highlights the contrasting algorithmic behaviors of YouTube search between two countries, underscoring the need for the platform to regulate algorithmic behavior consistently across different regions of the globe.
Submitted 14 April, 2025; v1 submitted 16 September, 2024;
originally announced September 2024.
-
The US Algorithmic Accountability Act of 2022 vs. The EU Artificial Intelligence Act: What can they learn from each other?
Authors:
Jakob Mokander,
Prathm Juneja,
David Watson,
Luciano Floridi
Abstract:
On the whole, the U.S. Algorithmic Accountability Act of 2022 (US AAA) is a pragmatic approach to balancing the benefits and risks of automated decision systems. Yet there is still room for improvement. This commentary highlights how the US AAA can both inform and learn from the European Artificial Intelligence Act (EU AIA).
Submitted 7 July, 2024;
originally announced July 2024.
-
Artificial Intelligence for the Internal Democracy of Political Parties
Authors:
Claudio Novelli,
Giuliano Formisano,
Prathm Juneja,
Giulia Sandri,
Luciano Floridi
Abstract:
The article argues that AI can enhance the measurement and implementation of democratic processes within political parties, known as Intra-Party Democracy (IPD). It identifies the limitations of traditional methods for measuring IPD, which often rely on formal parameters, self-reported data, and tools like surveys. Such limitations lead to the collection of partial data, rare updates, and significant demands on resources. To address these issues, the article suggests that specific data management and Machine Learning (ML) techniques, such as natural language processing and sentiment analysis, can improve the measurement (ML about) and practice (ML for) of IPD. The article concludes by considering some of the principal risks of ML for IPD, including concerns over data privacy, the potential for manipulation, and the dangers of overreliance on technology.
Submitted 26 October, 2024; v1 submitted 2 April, 2024;
originally announced May 2024.
-
Viblio: Introducing Credibility Signals and Citations to Video-Sharing Platforms
Authors:
Emelia Hughes,
Renee Wang,
Prerna Juneja,
Tony Li,
Tanu Mitra,
Amy Zhang
Abstract:
As more users turn to video-sharing platforms like YouTube as an information source, they may consume misinformation despite their best efforts. In this work, we investigate ways that users can better assess the credibility of videos by first exploring how users currently determine credibility using existing signals on platforms and then by introducing and evaluating new credibility-based signals. We conducted 12 contextual inquiry interviews with YouTube users, determining that participants used a combination of existing signals, such as the channel name, the production quality, and prior knowledge, to evaluate credibility, yet sometimes stumbled in their efforts to do so. We then developed Viblio, a prototype system that enables YouTube users to view and add citations and related information while watching a video based on our participants' needs. From an evaluation with 12 people, all participants found Viblio to be intuitive and useful in the process of evaluating a video's credibility and could see themselves using Viblio in the future.
Submitted 27 February, 2024;
originally announced February 2024.
-
Dissecting users' needs for search result explanations
Authors:
Prerna Juneja,
Wenjuan Zhang,
Alison Marie Smith-Renner,
Hemank Lamba,
Joel Tetreault,
Alex Jaimes
Abstract:
There is a growing demand for transparency in search engines to understand how search results are curated and to enhance users' trust. Prior research has introduced search result explanations with a focus on how to explain, assuming explanations are beneficial. Our study takes a step back to examine if search explanations are needed and when they are likely to provide benefits. Additionally, we summarize key characteristics of helpful explanations and share users' perspectives on explanation features provided by Google and Bing. Interviews with non-technical individuals reveal that users do not always seek or understand search explanations and mostly desire them for complex and critical tasks. They find Google's search explanations too obvious but appreciate the ability to contest search results. Based on our findings, we offer design recommendations for search engines and explanations to help users better evaluate search results and enhance their search experience.
Submitted 23 February, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
Assessing enactment of content regulation policies: A post hoc crowd-sourced audit of election misinformation on YouTube
Authors:
Prerna Juneja,
Md Momen Bhuiyan,
Tanushree Mitra
Abstract:
With the 2022 US midterm elections approaching, conspiratorial claims about the 2020 presidential elections continue to threaten users' trust in the electoral process. To regulate election misinformation, YouTube introduced policies to remove such content from its searches and recommendations. In this paper, we conduct a 9-day crowd-sourced audit on YouTube to assess the extent of enactment of such policies. We recruited 99 users who installed a browser extension that enabled us to collect up-next recommendation trails and search results for 45 videos and 88 search queries about the 2020 elections. We find that YouTube's search results, irrespective of search query bias, contain more videos that oppose rather than support election misinformation. However, watching misinformative election videos still leads users to a small number of misinformative videos in the up-next trails. Our results imply that while YouTube largely seems successful in regulating election misinformation, there is still room for improvement.
Submitted 15 February, 2023;
originally announced February 2023.
-
Multi-class Classifier based Failure Prediction with Artificial and Anonymous Training for Data Privacy
Authors:
Dibakar Das,
Vikram Seshasai,
Vineet Sudhir Bhat,
Pushkal Juneja,
Jyotsna Bapat,
Debabrata Das
Abstract:
This paper proposes a novel non-intrusive system failure prediction technique that uses available information from developers and minimal information from raw logs (rather than mining entire logs) while keeping the data entirely private with the data owners. A neural-network-based multi-class classifier is developed for failure prediction, using an artificially generated anonymous data set and applying a combination of techniques, viz., genetic algorithm (steps), pattern repetition, etc., to train and test the network. The proposed mechanism completely decouples the data set used for the training process from the actual data, which is kept private. Moreover, multi-criteria decision making (MCDM) schemes are used to prioritize failures meeting business requirements. Results show high accuracy in failure prediction under different parameter configurations. In a broader context, any classification problem, beyond failure prediction, can be performed using the proposed mechanism with an artificially generated data set without looking into the actual data, as long as the input features can be translated to binary values (e.g. output from private binary classifiers), and can provide classification-as-a-service.
Submitted 19 September, 2024; v1 submitted 6 September, 2022;
originally announced September 2022.
-
Human and technological infrastructures of fact-checking
Authors:
Prerna Juneja,
Tanushree Mitra
Abstract:
Increasing demands for fact-checking have led to a growing interest in developing systems and tools to automate the fact-checking process. However, such systems are limited in practice because their system design often does not take into account how fact-checking is done in the real world and ignores the insights and needs of various stakeholder groups core to the fact-checking process. This paper unpacks the fact-checking process by revealing the infrastructures -- both human and technological -- that support and shape fact-checking work. We interviewed 26 participants belonging to 16 fact-checking teams and organizations with representation from 4 continents. Through these interviews, we describe the human infrastructure of fact-checking by identifying and presenting, in depth, the roles of six primary stakeholder groups: 1) Editors, 2) External fact-checkers, 3) In-house fact-checkers, 4) Investigators and researchers, 5) Social media managers, and 6) Advocators. Our findings highlight that the fact-checking process is a collaborative effort among various stakeholder groups and associated technological and informational infrastructures. By rendering visibility to the infrastructures, we reveal how fact-checking has evolved to include both short-term claims-centric and long-term advocacy-centric fact-checking. Our work also identifies key social and technical needs and challenges faced by each stakeholder group. Based on our findings, we suggest that improving the quality of fact-checking requires systematic changes in the civic, informational, and technological contexts.
Submitted 22 May, 2022;
originally announced May 2022.
-
Algorithmic nudge to make better choices: Evaluating effectiveness of XAI frameworks to reveal biases in algorithmic decision making to users
Authors:
Prerna Juneja,
Tanushree Mitra
Abstract:
In this position paper, we propose the use of existing XAI frameworks to design interventions in scenarios where algorithms expose users to problematic content (e.g. anti-vaccine videos). Our intervention design includes facts (to indicate algorithmic justification of what happened) accompanied by either forewarnings or counterfactual explanations. While forewarnings indicate the potential risks of an action to users, counterfactual explanations indicate what actions the user should perform to change the algorithmic outcome. We envision the use of such interventions as 'decision aids' for users, which will help them make informed choices.
Submitted 4 February, 2022;
originally announced February 2022.
-
Auditing E-Commerce Platforms for Algorithmically Curated Vaccine Misinformation
Authors:
Prerna Juneja,
Tanushree Mitra
Abstract:
There is a growing concern that e-commerce platforms are amplifying vaccine misinformation. To investigate, we conduct two sets of algorithmic audits for vaccine misinformation on the search and recommendation algorithms of Amazon -- the world's leading e-retailer. First, we systematically audit search-results belonging to vaccine-related search-queries without logging into the platform -- unpersonalized audits. We find 10.47% of search-results promote misinformative health products. We also observe ranking-bias, with Amazon ranking misinformative search-results higher than debunking search-results. Next, we analyze the effects of personalization due to account-history, where history is built progressively by performing various real-world user-actions, such as clicking a product. We find evidence of a filter-bubble effect in Amazon's recommendations; accounts performing actions on misinformative products are presented with more misinformation compared to accounts performing actions on neutral and debunking products. Interestingly, once a user clicks on a misinformative product, homepage recommendations become more contaminated compared to when the user shows an intention to buy that product.
Submitted 29 January, 2021; v1 submitted 20 January, 2021;
originally announced January 2021.
-
Anvaya: An Algorithm and Case-Study on Improving the Goodness of Software Process Models generated by Mining Event-Log Data in Issue Tracking System
Authors:
Prerna Juneja,
Divya Kundra,
Ashish Sureka
Abstract:
Issue Tracking Systems (ITS) such as Bugzilla can be viewed as Process Aware Information Systems (PAIS) generating event-logs during the life-cycle of a bug report. Process Mining consists of mining event logs generated from PAIS for process model discovery, conformance and enhancement. We apply process map discovery techniques to mine event trace data generated from the ITS of the open source Firefox browser project to generate and study process models. Bug life-cycles are diverse and vary widely. Therefore, the process models generated from the event-logs are spaghetti-like, with a large number of edges, inter-connections and nodes. Such models are complex to analyse and difficult for a process analyst to comprehend. We improve the Goodness (fitness and structural complexity) of the process models by splitting the event-log into homogeneous subsets by clustering structurally similar traces. We adapt the K-Medoid clustering algorithm with two different distance metrics: Longest Common Subsequence (LCS) and Dynamic Time Warping (DTW). We evaluate the goodness of the process models generated from the clusters using complexity and fitness metrics. We study back-forth & self-loops, bug reopening, and bottlenecks in the clusters obtained and show that clustering enables better analysis. We also propose an algorithm to automate the clustering process: the algorithm takes as input the event log and returns the best cluster set.
Submitted 22 November, 2015;
originally announced November 2015.