Showing 1–15 of 15 results for author: Al-Khalifa, H

  1. arXiv:2509.21466  [pdf]

    cs.CV cs.AI cs.CL

    Gender Stereotypes in Professional Roles Among Saudis: An Analytical Study of AI-Generated Images Using Language Models

    Authors: Khaloud S. AlKhalifah, Malak Mashaabi, Hend Al-Khalifa

    Abstract: This study investigates the extent to which contemporary Text-to-Image artificial intelligence (AI) models perpetuate gender stereotypes and cultural inaccuracies when generating depictions of professionals in Saudi Arabia. We analyzed 1,006 images produced by ImageFX, DALL-E V3, and Grok for 56 diverse Saudi professions using neutral prompts. Two trained Saudi annotators evaluated each image on f…

    Submitted 25 September, 2025; originally announced September 2025.

  2. arXiv:2508.15830  [pdf, ps, other]

    cs.CL cs.AI

    DAIQ: Auditing Demographic Attribute Inference from Question in LLMs

    Authors: Srikant Panda, Hitesh Laxmichand Patel, Shahad Al-Khalifa, Amit Agarwal, Hend Al-Khalifa, Sharefah Al-Ghamdi

    Abstract: Large Language Models (LLMs) are known to reflect social biases when demographic attributes, such as gender or race, are explicitly present in the input. But even in their absence, these models still infer user identities based solely on question phrasing. This subtle behavior has received far less attention, yet poses serious risks: it violates expectations of neutrality, infers unintended demogr…

    Submitted 18 August, 2025; originally announced August 2025.

    Comments: Preprint

  3. arXiv:2508.14869  [pdf]

    q-bio.NC cs.CL

    The Prompting Brain: Neurocognitive Markers of Expertise in Guiding Large Language Models

    Authors: Hend Al-Khalifa, Raneem Almansour, Layan Abdulrahman Alhuasini, Alanood Alsaleh, Mohamad-Hani Temsah, Ashwag Rafea S Alruwaili

    Abstract: Prompt engineering has rapidly emerged as a critical skill for effective interaction with large language models (LLMs). However, the cognitive and neural underpinnings of this expertise remain largely unexplored. This paper presents findings from a cross-sectional pilot fMRI study investigating differences in brain functional connectivity and network activity between experts and intermediate promp…

    Submitted 20 August, 2025; originally announced August 2025.

  4. arXiv:2506.01340  [pdf, ps, other]

    cs.CL

    The Landscape of Arabic Large Language Models (ALLMs): A New Era for Arabic Language Technology

    Authors: Shahad Al-Khalifa, Nadir Durrani, Hend Al-Khalifa, Firoj Alam

    Abstract: The emergence of ChatGPT marked a transformative milestone for Artificial Intelligence (AI), showcasing the remarkable potential of Large Language Models (LLMs) to generate human-like text. This wave of innovation has revolutionized how we interact with technology, seamlessly integrating LLMs into everyday tasks such as vacation planning, email drafting, and content creation. While English-speakin…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Accepted at CACM

  5. arXiv:2502.08319  [pdf]

    cs.CL

    MultiProSE: A Multi-label Arabic Dataset for Propaganda, Sentiment, and Emotion Detection

    Authors: Lubna Al-Henaki, Hend Al-Khalifa, Abdulmalik Al-Salman, Hajar Alqubayshi, Hind Al-Twailay, Gheeda Alghamdi, Hawra Aljasim

    Abstract: Propaganda is a form of persuasion that has been used throughout history with the goal of influencing people's opinions through rhetorical and psychological persuasion techniques for determined ends. Although Arabic is ranked as the fourth most-used language on the internet, resources for propaganda detection in languages other than English, especially Arabic, remain extremely limited. To a…

    Submitted 12 February, 2025; originally announced February 2025.

    Comments: 12 pages, 3 figures, 4 tables

  6. arXiv:2412.15259  [pdf, other]

    cs.CL cs.SI

    GLARE: Google Apps Arabic Reviews Dataset

    Authors: Fatima AlGhamdi, Reem Mohammed, Hend Al-Khalifa, Areeb Alowisheq

    Abstract: This paper introduces GLARE, an Arabic Apps Reviews dataset collected from the Saudi Google PlayStore. It consists of 76M reviews, 69M of which are Arabic reviews of 9,980 Android Applications. We present the data collection methodology, along with a detailed Exploratory Data Analysis (EDA) and Feature Engineering on the gathered reviews. We also highlight possible use cases and benefits of the dataset…

    Submitted 16 December, 2024; originally announced December 2024.

    Comments: Github Repo: https://github.com/Fatima-Gh/GLARE Zenodo: https://zenodo.org/records/6457824

  7. arXiv:2410.20238  [pdf]

    cs.CL cs.AI

    A Survey of Large Language Models for Arabic Language and its Dialects

    Authors: Malak Mashaabi, Shahad Al-Khalifa, Hend Al-Khalifa

    Abstract: This survey offers a comprehensive overview of Large Language Models (LLMs) designed for Arabic language and its dialects. It covers key architectures, including encoder-only, decoder-only, and encoder-decoder models, along with the datasets used for pre-training, spanning Classical Arabic, Modern Standard Arabic, and Dialectal Arabic. The study also explores monolingual, bilingual, and multilingu…

    Submitted 24 February, 2025; v1 submitted 26 October, 2024; originally announced October 2024.

    Comments: Submitted to ACM Transactions on Asian and Low-Resource Language Information Processing

  8. arXiv:2408.12362  [pdf]

    cs.CL

    CLEANANERCorp: Identifying and Correcting Incorrect Labels in the ANERcorp Dataset

    Authors: Mashael Al-Duwais, Hend Al-Khalifa, Abdulmalik Al-Salman

    Abstract: Label errors are a common issue in machine learning datasets, particularly for tasks such as Named Entity Recognition. Such label errors might hurt model training, affect evaluation results, and lead to an inaccurate assessment of model performance. In this study, we dived deep into one of the widely adopted Arabic NER benchmark datasets (ANERcorp) and found a significant number of annotation erro…

    Submitted 8 March, 2025; v1 submitted 22 August, 2024; originally announced August 2024.

    Comments: Proceedings of the 6th Workshop on Open-Source Arabic Corpora and Processing Tools (OSACT) with Shared Tasks on Arabic LLMs Hallucination and Dialect to MSA Machine Translation @ LREC-COLING 2024

    Journal ref: ELRA and ICCL 2024

  9. arXiv:2407.00146  [pdf]

    cs.CL cs.AI

    The Qiyas Benchmark: Measuring ChatGPT Mathematical and Language Understanding in Arabic

    Authors: Shahad Al-Khalifa, Hend Al-Khalifa

    Abstract: Despite the growing importance of Arabic as a global language, there is a notable lack of language models pre-trained exclusively on Arabic data. This shortage has led to limited benchmarks available for assessing language model performance in Arabic. To address this gap, we introduce two novel benchmarks designed to evaluate models' mathematical reasoning and language understanding abilities in A…

    Submitted 28 June, 2024; originally announced July 2024.

  10. arXiv:2304.09866  [pdf]

    cs.HC cs.AI

    Towards Designing a ChatGPT Conversational Companion for Elderly People

    Authors: Abeer Alessa, Hend Al-Khalifa

    Abstract: Loneliness and social isolation are serious and widespread problems among older people, affecting their physical and mental health, quality of life, and longevity. In this paper, we propose a ChatGPT-based conversational companion system for elderly people. The system is designed to provide companionship and help reduce feelings of loneliness and social isolation. The system was evaluated with a p…

    Submitted 18 April, 2023; originally announced April 2023.

    Comments: 10 pages, 3 Figures, Workshop paper

  11. arXiv:2304.02757  [pdf]

    cs.CL cs.AI

    The Saudi Privacy Policy Dataset

    Authors: Hend Al-Khalifa, Malak Mashaabi, Ghadi Al-Yahya, Raghad Alnashwan

    Abstract: This paper introduces the Saudi Privacy Policy Dataset, a diverse compilation of Arabic privacy policies from various sectors in Saudi Arabia, annotated according to the 10 principles of the Personal Data Protection Law (PDPL). The PDPL was established to be compatible with the General Data Protection Regulation (GDPR), one of the most comprehensive data regulations worldwide. Data were collected from…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: 8 pages, 1 figure

  12. arXiv:2212.09523  [pdf]

    cs.CL cs.AI

    Natural Language Processing in Customer Service: A Systematic Review

    Authors: Malak Mashaabi, Areej Alotaibi, Hala Qudaih, Raghad Alnashwan, Hend Al-Khalifa

    Abstract: Artificial intelligence and natural language processing (NLP) are increasingly being used in customer service to interact with users and answer their questions. The goal of this systematic review is to examine existing research on the use of NLP technology in customer service, including the research domain, applications, datasets used, and evaluation methods. The review also looks at the future di…

    Submitted 16 December, 2022; originally announced December 2022.

  13. arXiv:2211.02119  [pdf]

    cs.CV cs.AI

    Handwritten Arabic Character Recognition for Children Writing Using Convolutional Neural Network and Stroke Identification

    Authors: Mais Alheraki, Rawan Al-Matham, Hend Al-Khalifa

    Abstract: Automatic Arabic handwritten recognition is one of the recently studied problems in the field of Machine Learning. Unlike Latin languages, Arabic is a Semitic language that forms a harder challenge, especially with variability of patterns caused by factors such as writer age. Most of the studies focused on adults, with only one recent study on children. Moreover, much of the recent Machine Learnin…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: 17

  14. arXiv:2011.12631  [pdf, ps, other]

    cs.CL

    A Panoramic Survey of Natural Language Processing in the Arab World

    Authors: Kareem Darwish, Nizar Habash, Mourad Abbas, Hend Al-Khalifa, Huseein T. Al-Natsheh, Samhaa R. El-Beltagy, Houda Bouamor, Karim Bouzoubaa, Violetta Cavalli-Sforza, Wassim El-Hajj, Mustafa Jarrar, Hamdy Mubarak

    Abstract: The term natural language refers to any system of symbolic communication (spoken, signed or written) without intentional human planning and design. This distinguishes natural languages such as Arabic and Japanese from artificially constructed languages such as Esperanto or Python. Natural language processing (NLP) is the sub-field of artificial intelligence (AI) focused on modeling natural languag…

    Submitted 27 September, 2021; v1 submitted 25 November, 2020; originally announced November 2020.

  15. arXiv:1805.08533  [pdf]

    cs.CL

    Sentiment Analysis of Arabic Tweets: Feature Engineering and A Hybrid Approach

    Authors: Nora Al-Twairesh, Hend Al-Khalifa, AbdulMalik Alsalman, Yousef Al-Ohali

    Abstract: Sentiment Analysis in Arabic is a challenging task due to the rich morphology of the language. Moreover, the task is further complicated when applied to Twitter data that is known to be highly informal and noisy. In this paper, we develop a hybrid method for sentiment analysis of Arabic tweets in a specific Arabic dialect, the Saudi Dialect. Several features were engineered and evaluated…

    Submitted 22 May, 2018; originally announced May 2018.