Empowering the Grid: Collaborative Edge Artificial Intelligence for Decentralized Energy Systems

Authors: Eddie de Paula Jr, Niel Bunda, Hezerul Abdul Karim, Nouar AlDahoul, Myles Joshua Toledo Tan

Abstract: This paper examines how decentralized energy systems can be enhanced using collaborative Edge Artificial Intelligence. Decentralized grids use local renewable sources to reduce transmission losses and improve energy security. Edge AI enables real-time, privacy-preserving data processing at the network edge. Techniques such as federated learning and distributed control improve demand response, equipment maintenance, and energy optimization. The paper discusses key challenges including data privacy, scalability, and interoperability, and suggests solutions such as blockchain integration and adaptive architectures. Examples from virtual power plants and smart grids highlight the potential of these technologies. The paper calls for increased investment, policy support, and collaboration to advance sustainable energy systems.

Submitted 11 May, 2025; originally announced May 2025.

Comments: 16 pages, 1 table

arXiv:2505.04171 [pdf, other]

Large Language Models are often politically extreme, usually ideologically inconsistent, and persuasive even in informational contexts

Authors: Nouar Aldahoul, Hazem Ibrahim, Matteo Varvello, Aaron Kaufman, Talal Rahwan, Yasir Zaki

Abstract: Large Language Models (LLMs) are a transformational technology, fundamentally changing how people obtain information and interact with the world. As people become increasingly reliant on them for an enormous variety of tasks, a body of academic research has developed to examine these models for inherent biases, especially political biases, often finding them small. We challenge this prevailing wisdom. First, by comparing 31 LLMs to legislators, judges, and a nationally representative sample of U.S. voters, we show that LLMs' apparently small overall partisan preference is the net result of offsetting extreme views on specific topics, much like moderate voters. Second, in a randomized experiment, we show that LLMs can promulgate their preferences into political persuasiveness even in information-seeking contexts: voters randomized to discuss political issues with an LLM chatbot are as much as 5 percentage points more likely to express the same preferences as that chatbot. Contrary to expectations, these persuasive effects are not moderated by familiarity with LLMs, news consumption, or interest in politics. LLMs, especially those controlled by private companies or governments, may become a powerful and targeted vector for political influence.

Submitted 7 May, 2025; originally announced May 2025.

Comments: 61 pages, 29 figures

arXiv:2504.03520 [pdf, other]

Neutralizing the Narrative: AI-Powered Debiasing of Online News Articles

Authors: Chen Wei Kuo, Kevin Chu, Nouar AlDahoul, Hazem Ibrahim, Talal Rahwan, Yasir Zaki

Abstract: Bias in news reporting significantly impacts public perception, particularly regarding crime, politics, and societal issues. Traditional bias detection methods, predominantly reliant on human moderation, suffer from subjective interpretations and scalability constraints. Here, we introduce an AI-driven framework leveraging advanced large language models (LLMs), specifically GPT-4o, GPT-4o Mini, Gemini Pro, Gemini Flash, Llama 8B, and Llama 3B, to systematically identify and mitigate biases in news articles. To this end, we collect an extensive dataset consisting of over 30,000 crime-related articles from five politically diverse news sources spanning a decade (2013-2023). Our approach employs a two-stage methodology: (1) bias detection, where each LLM scores and justifies biased content at the paragraph level, validated through human evaluation for ground truth establishment, and (2) iterative debiasing using GPT-4o Mini, verified by both automated reassessment and human reviewers. Empirical results indicate GPT-4o Mini's superior accuracy in bias detection and effectiveness in debiasing. Furthermore, our analysis reveals temporal and geographical variations in media bias correlating with socio-political dynamics and real-world events. This study contributes to scalable computational methodologies for bias mitigation, promoting fairness and accountability in news reporting.

Submitted 4 April, 2025; originally announced April 2025.

Comments: 23 pages, 3 figures

arXiv:2502.08995 [pdf, other]

PixLift: Accelerating Web Browsing via AI Upscaling

Authors: Yonas Atinafu, Sarthak Malla, HyunSeok Daniel Jang, Nouar Aldahoul, Matteo Varvello, Yasir Zaki

Abstract: Accessing the internet in regions with expensive data plans and limited connectivity poses significant challenges, restricting information access and economic growth. Images, as a major contributor to webpage sizes, exacerbate this issue, despite advances in compression formats like WebP and AVIF. The continued growth of complex and curated web content, coupled with suboptimal optimization practices in many regions, has prevented meaningful reductions in web page sizes. This paper introduces PixLift, a novel solution to reduce webpage sizes by downscaling their images during transmission and leveraging AI models on user devices to upscale them. By trading computational resources for bandwidth, PixLift enables more affordable and inclusive web access. We address key challenges, including the feasibility of scaled image requests on popular websites, the implementation of PixLift as a browser extension, and its impact on user experience. Through the analysis of 71.4k webpages, evaluations of three mainstream upscaling models, and a user study, we demonstrate PixLift's ability to significantly reduce data usage without compromising image quality, fostering a more equitable internet.

Submitted 13 February, 2025; originally announced February 2025.

Comments: 9 pages, 2 figures

arXiv:2502.05698 [pdf]

A Conceptual Exploration of Generative AI-Induced Cognitive Dissonance and its Emergence in University-Level Academic Writing

Authors: Carl Errol Seran, Myles Joshua Toledo Tan, Hezerul Abdul Karim, Nouar AlDahoul

Abstract: The integration of Generative Artificial Intelligence (GenAI) into university-level academic writing presents both opportunities and challenges, particularly in relation to cognitive dissonance (CD). This work explores how GenAI serves as both a trigger and amplifier of CD, as students navigate ethical concerns, academic integrity, and self-efficacy in their writing practices. By synthesizing empirical evidence and theoretical insights, we introduce a hypothetical construct of GenAI-induced CD, illustrating the psychological tension between AI-driven efficiency and the principles of originality, effort, and intellectual ownership. We further discuss strategies to mitigate this dissonance, including reflective pedagogy, AI literacy programs, transparency in GenAI use, and discipline-specific task redesigns. These approaches reinforce critical engagement with AI, fostering a balanced perspective that integrates technological advancements while safeguarding human creativity and learning. Our findings contribute to ongoing discussions on AI in education, self-regulated learning, and ethical AI use, offering a conceptual framework for institutions to develop guidelines that align AI adoption with academic values.

Submitted 8 February, 2025; originally announced February 2025.

Comments: 9 pages, 1 figure

arXiv:2501.17831 [pdf, other]

TikTok's recommendations skewed towards Republican content during the 2024 U.S. presidential race

Authors: Hazem Ibrahim, HyunSeok Daniel Jang, Nouar Aldahoul, Aaron R. Kaufman, Talal Rahwan, Yasir Zaki

Abstract: TikTok is a major force among social media platforms with over a billion monthly active users worldwide and 170 million in the United States. The platform's status as a key news source, particularly among younger demographics, raises concerns about its potential influence on politics in the U.S. and globally. Despite these concerns, there is scant research investigating TikTok's recommendation algorithm for political biases. We fill this gap by conducting 323 independent algorithmic audit experiments testing partisan content recommendations in the lead-up to the 2024 U.S. presidential elections. Specifically, we create hundreds of "sock puppet" TikTok accounts in Texas, New York, and Georgia, seeding them with varying partisan content and collecting algorithmic content recommendations for each of them. Collectively, these accounts viewed ~394,000 videos from April 30th to November 11th, 2024, which we label for political and partisan content. Our analysis reveals significant asymmetries in content distribution: Republican-seeded accounts received ~11.8% more party-aligned recommendations compared to their Democratic-seeded counterparts, and Democratic-seeded accounts were exposed to ~7.5% more opposite-party recommendations on average. These asymmetries exist across all three states and persist when accounting for video- and channel-level engagement metrics such as likes, views, shares, comments, and followers, and are driven primarily by negative partisanship content. Our findings provide insights into the inner workings of TikTok's recommendation algorithm during a critical election period, raising fundamental questions about platform neutrality.

Submitted 7 May, 2025; v1 submitted 29 January, 2025; originally announced January 2025.

Comments: 79 pages, 14 Figures, 35 Tables

arXiv:2412.14197 [pdf, other]

Advancing Vehicle Plate Recognition: Multitasking Visual Language Models with VehiclePaliGemma

Authors: Nouar AlDahoul, Myles Joshua Toledo Tan, Raghava Reddy Tera, Hezerul Abdul Karim, Chee How Lim, Manish Kumar Mishra, Yasir Zaki

Abstract: License plate recognition (LPR) involves automated systems that utilize cameras and computer vision to read vehicle license plates. Such plates collected through LPR can then be compared against databases to identify stolen vehicles, uninsured drivers, crime suspects, and more. The LPR system plays a significant role in saving time for institutions such as the police force. In the past, LPR relied heavily on Optical Character Recognition (OCR), which has been widely explored to recognize characters in images. Usually, collected plate images suffer from various limitations, including noise, blurring, weather conditions, and close characters, making the recognition complex. Existing LPR methods still require significant improvement, especially for distorted images. To fill this gap, we propose utilizing visual language models (VLMs) such as OpenAI GPT4o, Google Gemini 1.5, Google PaliGemma (Pathways Language and Image model + Gemma model), Meta Llama 3.2, Anthropic Claude 3.5 Sonnet, LLaVA, NVIDIA VILA, and moondream2 to recognize such unclear plates with close characters. This paper evaluates the VLMs' capability to address the aforementioned problems. Additionally, we introduce "VehiclePaliGemma", a fine-tuned open-source PaliGemma VLM designed to recognize plates under challenging conditions. We compared our proposed VehiclePaliGemma with state-of-the-art methods and other VLMs using a dataset of Malaysian license plates collected under complex conditions. The results indicate that VehiclePaliGemma achieved superior performance with an accuracy of 87.6%. Moreover, it is able to predict the car's plate at a speed of 7 frames per second using an A100-80GB GPU. Finally, we explored the multitasking capability of the VehiclePaliGemma model to accurately identify plates in images containing multiple cars of various models and colors, with plates positioned and oriented in different directions.

Submitted 14 December, 2024; originally announced December 2024.

Comments: 33 pages, 9 figures

arXiv:2411.17123 [pdf, other]

Advancing Content Moderation: Evaluating Large Language Models for Detecting Sensitive Content Across Text, Images, and Videos

Authors: Nouar AlDahoul, Myles Joshua Toledo Tan, Harishwar Reddy Kasireddy, Yasir Zaki

Abstract: The widespread dissemination of hate speech, harassment, harmful and sexual content, and violence across websites and media platforms presents substantial challenges and provokes widespread concern among different sectors of society. Governments, educators, and parents are often at odds with media platforms about how to regulate, control, and limit the spread of such content. Technologies for detecting and censoring media content are a key solution to addressing these challenges. Techniques from natural language processing and computer vision have been used widely to automatically identify and filter out sensitive content such as offensive language, violence, nudity, and addiction across text, images, and videos, enabling platforms to enforce content policies at scale. However, existing methods still have limitations in achieving high detection accuracy with fewer false positives and false negatives. Therefore, more sophisticated algorithms for understanding the context of both text and image may open room for improvement in content censorship to build a more efficient censorship system. In this paper, we evaluate existing LLM-based content moderation solutions such as the OpenAI moderation model and Llama-Guard3 and study their capabilities to detect sensitive content. Additionally, we explore recent LLMs such as GPT, Gemini, and Llama in identifying inappropriate content across media outlets. Various textual and visual datasets like X tweets, Amazon reviews, news articles, human photos, cartoons, sketches, and violence videos have been utilized for evaluation and comparison. The results demonstrate that LLMs outperform traditional techniques by achieving higher accuracy and lower false positive and false negative rates. This highlights the potential to integrate LLMs into websites, social media platforms, and video-sharing services for regulatory and content moderation purposes.

Submitted 26 November, 2024; originally announced November 2024.

Comments: 55 pages, 16 figures

arXiv:2411.02307 [pdf]

Can Personalized Medicine Coexist with Health Equity? Examining the Cost Barrier and Ethical Implications

Authors: Kishi Kobe Yee Francisco, Andrane Estelle Carnicer Apuhin, Myles Joshua Toledo Tan, Mickael Cavanaugh Byers, Nicholle Mae Amor Tan Maravilla, Hezerul Abdul Karim, Nouar AlDahoul

Abstract: Personalized medicine (PM) promises to transform healthcare by providing treatments tailored to individual genetic, environmental, and lifestyle factors. However, its high costs and infrastructure demands raise concerns about exacerbating health disparities, especially between high-income countries (HICs) and low- and middle-income countries (LMICs). While HICs benefit from advanced PM applications through AI and genomics, LMICs often lack the resources necessary to adopt these innovations, leading to a widening healthcare divide. This paper explores the financial and ethical challenges of PM implementation, with a focus on ensuring equitable access. It proposes strategies for global collaboration, infrastructure development, and ethical frameworks to support LMICs in adopting PM, aiming to prevent further disparities in healthcare accessibility and outcomes.

Submitted 4 November, 2024; originally announced November 2024.

Comments: 30 pages, 1 figure

arXiv:2410.24148 [pdf, other]

Exploring Vision Language Models for Facial Attribute Recognition: Emotion, Race, Gender, and Age

Authors: Nouar AlDahoul, Myles Joshua Toledo Tan, Harishwar Reddy Kasireddy, Yasir Zaki

Abstract: Technologies for recognizing facial attributes like race, gender, age, and emotion have several applications, such as surveillance, advertising content, sentiment analysis, and the study of demographic trends and social behaviors. Analyzing demographic characteristics based on images and analyzing facial expressions have several challenges due to the complexity of humans' facial attributes. Traditional approaches have employed CNNs and various other deep learning techniques, trained on extensive collections of labeled images. While these methods demonstrated effective performance, there remains potential for further enhancements. In this paper, we propose to utilize vision language models (VLMs) such as generative pre-trained transformer (GPT), GEMINI, large language and vision assistant (LLAVA), PaliGemma, and Microsoft Florence2 to recognize facial attributes such as race, gender, age, and emotion from images with human faces. Various datasets like FairFace, AffectNet, and UTKFace have been utilized to evaluate the solutions. The results show that VLMs are competitive with, if not superior to, traditional techniques. Additionally, we propose "FaceScanPaliGemma"--a fine-tuned PaliGemma model--for race, gender, age, and emotion recognition. The results show an accuracy of 81.1%, 95.8%, 80%, and 59.4% for race, gender, age group, and emotion classification, respectively, outperforming the pre-trained version of PaliGemma, other VLMs, and SotA methods. Finally, we propose "FaceScanGPT", a GPT-4o model that recognizes the above attributes when several individuals are present in the image, using a prompt engineered for a person with specific facial and/or physical attributes. The results underscore the superior multitasking capability of FaceScanGPT to detect an individual's attributes such as haircut, clothing color, and posture, using only a prompt to drive the detection and recognition tasks.

Submitted 31 October, 2024; originally announced October 2024.

Comments: 52 pages, 13 figures

arXiv:2410.21898 [pdf, other]

A Longitudinal Analysis of Racial and Gender Bias in New York Times and Fox News Images and Articles

Authors: Hazem Ibrahim, Nouar AlDahoul, Syed Mustafa Ali Abbasi, Fareed Zaffar, Talal Rahwan, Yasir Zaki

Abstract: The manner in which different racial and gender groups are portrayed in news coverage plays a large role in shaping public opinion. As such, understanding how such groups are portrayed in news media is of notable societal value, and has thus been a significant endeavour in both the computer and social sciences. Yet, the literature still lacks a longitudinal study examining both the frequency of appearance of different racial and gender groups in online news articles, as well as the context in which such groups are discussed. To fill this gap, we propose two machine learning classifiers to detect the race and age of a given subject. Next, we compile a dataset of 123,337 images and 441,321 online news articles from New York Times (NYT) and Fox News (Fox), and examine representation through two computational approaches. Firstly, we examine the frequency and prominence of appearance of racial and gender groups in images embedded in news articles, revealing that racial and gender minorities are largely under-represented, and when they do appear, they are featured less prominently compared to majority groups. Furthermore, we find that NYT largely features more images of racial minority groups compared to Fox. Secondly, we examine both the frequency and context with which racial minority groups are presented in article text. This reveals the narrow scope in which certain racial groups are covered and the frequency with which different groups are presented as victims and/or perpetrators in a given conflict. Taken together, our analysis contributes to the literature by providing two novel open-source classifiers to detect race and age from images, and shedding light on the racial and gender biases in news articles from venues on opposite ends of the American political spectrum.

Submitted 31 October, 2024; v1 submitted 29 October, 2024; originally announced October 2024.

Comments: 13 pages, 11 figures

arXiv:2409.15361 [pdf, other]

Multitask Mayhem: Unveiling and Mitigating Safety Gaps in LLMs Fine-tuning

Authors: Essa Jan, Nouar AlDahoul, Moiz Ali, Faizan Ahmad, Fareed Zaffar, Yasir Zaki

Abstract: Recent breakthroughs in Large Language Models (LLMs) have led to their adoption across a wide range of tasks, from code generation to machine translation and sentiment analysis. Red teaming/Safety alignment efforts show that fine-tuning models on benign (non-harmful) data could compromise safety. However, it remains unclear to what extent this phenomenon is influenced by different variables, including fine-tuning task, model calibrations, etc. This paper explores the task-wise safety degradation due to fine-tuning on downstream tasks such as summarization, code generation, translation, and classification across various calibrations. Our results reveal that: 1) Fine-tuning LLMs for code generation and translation leads to the highest degradation in safety guardrails. 2) LLMs generally have weaker guardrails for translation and classification, with 73-92% of harmful prompts answered, across baseline and other calibrations, falling into one of two concern categories. 3) Current solutions, including guards and safety tuning datasets, lack cross-task robustness. To address these issues, we developed a new multitask safety dataset effectively reducing attack success rates across a range of tasks without compromising the model's overall helpfulness. Our work underscores the need for generalized alignment measures to ensure safer and more robust models.

Submitted 18 September, 2024; originally announced September 2024.

Comments: 19 pages, 11 figures

arXiv:2406.10400 [pdf, other]

Self-Reflection Makes Large Language Models Safer, Less Biased, and Ideologically Neutral

Authors: Fengyuan Liu, Nouar AlDahoul, Gregory Eady, Yasir Zaki, Talal Rahwan

Abstract: Previous studies proposed that the reasoning capabilities of large language models (LLMs) can be improved through self-reflection, i.e., letting LLMs reflect on their own output to identify and correct mistakes in the initial responses. However, earlier experiments offer mixed results when it comes to the benefits of self-reflection. Furthermore, prior studies on self-reflection are predominantly concerned with the reasoning capabilities of models, ignoring the potential for self-reflection in safety, bias, and ideological leaning. Here, by conducting a series of experiments testing LLMs' self-reflection capability in various tasks using a variety of prompts and different LLMs, we make several contributions to the literature. First, we reconcile conflicting findings regarding the benefit of self-reflection, by demonstrating that the outcome of self-reflection is sensitive to prompt wording -- both the original prompt that is used to elicit an initial answer and the subsequent prompt used to self-reflect. Specifically, although self-reflection may improve the reasoning capability of LLMs when the initial response is simple, the technique cannot improve upon the state-of-the-art chain-of-thought (CoT) prompting. Second, we show that self-reflection can lead to safer (75.8% reduction in toxic responses while preserving 97.8% non-toxic ones), less biased (77% reduction in gender biased responses, while preserving 94.3% unbiased ones), and more ideologically neutral responses (100% reduction in partisan-leaning responses, while preserving 87.7% non-partisan ones). The paper concludes by discussing the implications of our findings on the deployment of large language models. We release our experiments at https://github.com/Michael98Liu/self-reflection.

Submitted 16 February, 2025; v1 submitted 14 June, 2024; originally announced June 2024.

arXiv:2405.06404 [pdf, other]

Inclusive content reduces racial and gender biases, yet non-inclusive content dominates popular culture

Authors: Nouar AlDahoul, Hazem Ibrahim, Minsu Park, Talal Rahwan, Yasir Zaki

Abstract: Images are often termed as representations of perceived reality. As such, racial and gender biases in popular culture and visual media could play a critical role in shaping people's perceptions of society. While previous research has made significant progress in exploring the frequency and discrepancies in racial and gender group appearances in visual media, it has largely overlooked important nuances in how these groups are portrayed, as it lacked the ability to systematically capture such complexities at scale over time. To address this gap, we examine two media forms of varying target audiences, namely fashion magazines and movie posters. Accordingly, we collect a large dataset comprising over 300,000 images spanning over five decades and utilize state-of-the-art machine learning models to classify not only race and gender but also the posture, expressed emotional state, and body composition of individuals featured in each image. We find that racial minorities appear far less frequently than their White counterparts, and when they do appear, they are portrayed less prominently. We also find that women are more likely to be portrayed with their full bodies, whereas men are more frequently presented with their faces. Finally, through a series of survey experiments, we find evidence that exposure to inclusive content can help reduce biases in perceptions of minorities, while racially and gender-homogenized content may reinforce and amplify such biases. Taken together, our findings highlight that racial and gender biases in visual media remain pervasive, potentially exacerbating existing stereotypes and inequalities.

Submitted 19 November, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

Comments: 86 pages, 17 figures

arXiv:2404.04261 [pdf, other]

A Novel BERT-based Classifier to Detect Political Leaning of YouTube Videos based on their Titles

Authors: Nouar AlDahoul, Talal Rahwan, Yasir Zaki

Abstract: A quarter of US adults regularly get their news from YouTube. Yet, despite the massive political content available on the platform, to date no classifier has been proposed to identify the political leaning of YouTube videos. To fill this gap, we propose a novel classifier based on BERT -- a language model from Google -- to classify YouTube videos merely based on their titles into six categories, namely: Far Left, Left, Center, Anti-Woke, Right, and Far Right. We used a public dataset of 10 million YouTube video titles (under various categories) to train and validate the proposed classifier. We compare the classifier against several alternatives that we trained on the same dataset, revealing that our classifier achieves the highest accuracy (75%) and the highest F1 score (77%). To further validate the classification performance, we collect videos from YouTube channels of numerous prominent news agencies, such as Fox News and New York Times, which have widely known political leanings, and apply our classifier to their video titles. For the vast majority of cases, the predicted political leaning matches that of the news agency.

Submitted 16 February, 2024; originally announced April 2024.

Comments: 14 pages, 4 figures

arXiv:2402.01002 [pdf, other]

AI-generated faces influence gender stereotypes and racial homogenization

Authors: Nouar AlDahoul, Talal Rahwan, Yasir Zaki

Abstract: Text-to-image generative AI models such as Stable Diffusion are used daily by millions worldwide. However, the extent to which these models exhibit racial and gender stereotypes is not yet fully understood. Here, we document significant biases in Stable Diffusion across six races, two genders, 32 professions, and eight attributes. Additionally, we examine the degree to which Stable Diffusion depicts individuals of the same race as being similar to one another. This analysis reveals significant racial homogenization, e.g., depicting nearly all Middle Eastern men as bearded, brown-skinned, and wearing traditional attire. We then propose debiasing solutions that allow users to specify the desired distributions of race and gender when generating images while minimizing racial homogenization. Finally, using a preregistered survey experiment, we find evidence that being presented with inclusive AI-generated faces reduces people's racial and gender biases, while being presented with non-inclusive ones increases such biases, regardless of whether the images are labeled as AI-generated. Taken together, our findings emphasize the need to address biases and stereotypes in text-to-image models.

Submitted 21 November, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

Comments: 47 pages, 19 figures

arXiv:2310.17370 [pdf, other]

Exploring the Potential of Generative AI for the World Wide Web

Authors: Nouar AlDahoul, Joseph Hong, Matteo Varvello, Yasir Zaki

Abstract: Generative Artificial Intelligence (AI) is a cutting-edge technology capable of producing text, images, and various media content by leveraging generative models and user prompts. Between 2022 and 2023, generative AI surged in popularity with a plethora of applications spanning from AI-powered movies to chatbots. In this paper, we delve into the potential of generative AI within the realm of the World Wide Web, specifically focusing on image generation. Web developers already harness generative AI to help craft text and images, while Web browsers might use it in the future to locally generate images for tasks like repairing broken webpages, conserving bandwidth, and enhancing privacy. To explore this research area, we have developed WebDiffusion, a tool that simulates a Web powered by Stable Diffusion, a popular text-to-image model, from both a client and server perspective. WebDiffusion further supports crowdsourcing of user opinions, which we use to evaluate the quality and accuracy of 409 AI-generated images sourced from 60 webpages. Our findings suggest that generative AI is already capable of producing pertinent and high-quality Web images, even without requiring Web designers to manually input prompts, just by leveraging contextual information available within the webpages. However, we acknowledge that direct in-browser image generation remains a challenge, as only highly powerful GPUs, such as the A40 and A100, can (partially) compete with classic image downloads. Nevertheless, this approach could be valuable for a subset of the images, for example when fixing broken webpages or handling highly private content.

Submitted 26 October, 2023; originally announced October 2023.

Comments: 11 pages, 9 figures

arXiv:2208.01963 [pdf]

Localization and Classification of Parasitic Eggs in Microscopic Images Using an EfficientDet Detector

Authors: Nouar AlDahoul, Hezerul Abdul Karim, Shaira Limson Kee, Myles Joshua Toledo Tan

Abstract: IPIs caused by protozoan and helminth parasites are among the most common infections in humans in LMICs. They are regarded as a severe public health concern, as they cause a wide array of potentially detrimental health conditions. Researchers have been developing pattern recognition techniques for the automatic identification of parasite eggs in microscopic images. Existing solutions still need improvements to reduce diagnostic errors and generate fast, efficient, and accurate results. Our paper addresses this and proposes a multi-modal learning detector to localize parasitic eggs and categorize them into 11 categories. The experiments were conducted on the novel Chula-ParasiteEgg-11 dataset, which was used to train both an EfficientDet model with an EfficientNet-v2 backbone and EfficientNet-B7+SVM. The dataset has 11,000 microscopic training images from 11 categories. Our results show robust performance with an accuracy of 92% and an F1 score of 93%. Additionally, the IOU distribution illustrates the high localization capability of the detector.

Submitted 3 August, 2022; originally announced August 2022.

Comments: 6 pages, 7 figures, to be published in IEEE International Conference on Image Processing 2022

ACM Class: I.2.1; I.4.5; I.4.9; I.5.4; J.3

Showing 1–18 of 18 results for author: AlDahoul, N