Showing 1–14 of 14 results for author: Davani, A M

  1. arXiv:2501.01056

    cs.CL cs.AI

    Risks of Cultural Erasure in Large Language Models

    Authors: Rida Qadri, Aida M. Davani, Kevin Robinson, Vinodkumar Prabhakaran

    Abstract: Large language models are increasingly being integrated into applications that shape the production and discovery of societal knowledge, such as search, online education, and travel planning. As a result, language models will shape how people learn about, perceive, and interact with global cultures, making it important to consider whose knowledge systems and perspectives are represented in models. Re…

    Submitted 1 January, 2025; originally announced January 2025.

  2. arXiv:2410.17032

    cs.AI

    Insights on Disagreement Patterns in Multimodal Safety Perception across Diverse Rater Groups

    Authors: Charvi Rastogi, Tian Huey Teh, Pushkar Mishra, Roma Patel, Zoe Ashwood, Aida Mostafazadeh Davani, Mark Diaz, Michela Paganini, Alicia Parrish, Ding Wang, Vinodkumar Prabhakaran, Lora Aroyo, Verena Rieser

    Abstract: AI systems crucially rely on human ratings, but these ratings are often aggregated, obscuring the inherent diversity of perspectives in real-world phenomena. This is particularly concerning when evaluating the safety of generative AI, where perceptions and associated harms can vary significantly across socio-cultural contexts. While recent research has studied the impact of demographic difference…

    Submitted 22 October, 2024; originally announced October 2024.

    Comments: 20 pages, 7 figures

  3. arXiv:2404.10857

    cs.CL

    D3CODE: Disentangling Disagreements in Data across Cultures on Offensiveness Detection and Evaluation

    Authors: Aida Mostafazadeh Davani, Mark Díaz, Dylan Baker, Vinodkumar Prabhakaran

    Abstract: While human annotations play a crucial role in language technologies, annotator subjectivity has long been overlooked in data collection. Recent studies that have critically examined this issue are often situated in the Western context, and solely document differences across age, gender, or racial groups. As a result, NLP research on subjectivity has overlooked the fact that individuals within de…

    Submitted 16 April, 2024; originally announced April 2024.

  4. arXiv:2404.05866

    cs.CL

    GeniL: A Multilingual Dataset on Generalizing Language

    Authors: Aida Mostafazadeh Davani, Sagar Gubbi, Sunipa Dev, Shachi Dave, Vinodkumar Prabhakaran

    Abstract: Generative language models are transforming our digital ecosystem, but they often inherit societal biases, for instance stereotypes associating certain attributes with specific identity groups. While whether and how these biases are mitigated may depend on the specific use cases, being able to effectively detect instances of stereotype perpetuation is a crucial first step. Current methods to asses…

    Submitted 9 August, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  5. arXiv:2311.05074

    cs.CL cs.AI

    GRASP: A Disagreement Analysis Framework to Assess Group Associations in Perspectives

    Authors: Vinodkumar Prabhakaran, Christopher Homan, Lora Aroyo, Aida Mostafazadeh Davani, Alicia Parrish, Alex Taylor, Mark Díaz, Ding Wang, Gregory Serapio-García

    Abstract: Human annotation plays a core role in machine learning -- annotations for supervised models, safety guardrails for generative models, and human feedback for reinforcement learning, to cite a few avenues. However, the fact that many of these human annotations are inherently subjective is often overlooked. Recent work has demonstrated that ignoring rater subjectivity (typically resulting in rater di…

    Submitted 13 June, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: Presented as a long paper at NAACL 2024 main conference

    Journal ref: 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics

  6. arXiv:2208.05545

    cs.CL cs.CY cs.LG

    The Moral Foundations Reddit Corpus

    Authors: Jackson Trager, Alireza S. Ziabari, Aida Mostafazadeh Davani, Preni Golazizian, Farzan Karimi-Malekabadi, Ali Omrani, Zhihe Li, Brendan Kennedy, Nils Karl Reimer, Melissa Reyes, Kelsey Cheng, Mellow Wei, Christina Merrifield, Arta Khosravi, Evans Alvarez, Morteza Dehghani

    Abstract: Moral framing and sentiment can affect a variety of online and offline behaviors, including donation, pro-environmental action, political engagement, and even participation in violent protests. Various computational methods in Natural Language Processing (NLP) have been used to detect moral sentiment from textual data, but in order to achieve better performances in such subjective tasks, large set…

    Submitted 17 August, 2022; v1 submitted 10 August, 2022; originally announced August 2022.

  7. arXiv:2110.14839

    cs.CL cs.CY

    Hate Speech Classifiers Learn Human-Like Social Stereotypes

    Authors: Aida Mostafazadeh Davani, Mohammad Atari, Brendan Kennedy, Morteza Dehghani

    Abstract: Social stereotypes negatively impact individuals' judgements about different groups and may have a critical role in how people understand language directed toward minority social groups. Here, we assess the role of social stereotypes in the automated detection of hateful language by examining the relation between individual annotator biases and erroneous classification of texts by hate speech clas…

    Submitted 27 October, 2021; originally announced October 2021.

  8. arXiv:2110.05719

    cs.CL cs.CY

    Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations

    Authors: Aida Mostafazadeh Davani, Mark Díaz, Vinodkumar Prabhakaran

    Abstract: Majority voting and averaging are common approaches employed to resolve annotator disagreements and derive single ground truth labels from multiple annotations. However, annotators may systematically disagree with one another, often reflecting their individual biases and values, especially in the case of subjective tasks such as detecting affect, aggression, and hate speech. Annotator disagreement…

    Submitted 11 October, 2021; originally announced October 2021.

  9. arXiv:2110.05699

    cs.CL cs.CY

    On Releasing Annotator-Level Labels and Information in Datasets

    Authors: Vinodkumar Prabhakaran, Aida Mostafazadeh Davani, Mark Díaz

    Abstract: A common practice in building NLP datasets, especially using crowd-sourced annotations, involves obtaining multiple annotator judgements on the same data instances, which are then flattened to produce a single "ground truth" label or score, through majority voting, averaging, or adjudication. While these approaches may be appropriate in certain annotation tasks, such aggregations overlook the soci…

    Submitted 11 October, 2021; originally announced October 2021.

  10. arXiv:2108.01721

    cs.CL cs.CY

    Improving Counterfactual Generation for Fair Hate Speech Detection

    Authors: Aida Mostafazadeh Davani, Ali Omrani, Brendan Kennedy, Mohammad Atari, Xiang Ren, Morteza Dehghani

    Abstract: Bias mitigation approaches reduce models' dependence on sensitive features of data, such as social group tokens (SGTs), resulting in equal predictions across the sensitive features. In hate speech detection, however, equalizing model predictions may ignore important differences among targeted social groups, as hate speech can contain stereotypical language specific to each SGT. Here, to take the s…

    Submitted 3 August, 2021; originally announced August 2021.

    Comments: 4-page paper, Workshop on Online Abuse and Harms (ACL 2021)

  11. arXiv:2010.12864

    cs.CL stat.ML

    On Transferability of Bias Mitigation Effects in Language Model Fine-Tuning

    Authors: Xisen Jin, Francesco Barbieri, Brendan Kennedy, Aida Mostafazadeh Davani, Leonardo Neves, Xiang Ren

    Abstract: Fine-tuned language models have been shown to exhibit biases against protected groups in a host of modeling tasks such as text classification and coreference resolution. Previous works focus on detecting these biases, reducing bias in data representations, and using auxiliary training objectives to mitigate bias during fine-tuning. Although these techniques achieve bias reduction for the task and…

    Submitted 11 April, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: 14 pages; Accepted at NAACL 2021

  12. arXiv:2010.12779

    cs.CL

    Fair Hate Speech Detection through Evaluation of Social Group Counterfactuals

    Authors: Aida Mostafazadeh Davani, Ali Omrani, Brendan Kennedy, Mohammad Atari, Xiang Ren, Morteza Dehghani

    Abstract: Approaches for mitigating bias in supervised models are designed to reduce models' dependence on specific sensitive features of the input data, e.g., mentioned social groups. However, in the case of hate speech detection, it is not always desirable to equalize the effects of social groups because of their essential role in distinguishing outgroup-derogatory hate, such that particular types of hate…

    Submitted 24 October, 2020; originally announced October 2020.

  13. arXiv:2005.02439

    cs.CL cs.IR cs.LG

    Contextualizing Hate Speech Classifiers with Post-hoc Explanation

    Authors: Brendan Kennedy, Xisen Jin, Aida Mostafazadeh Davani, Morteza Dehghani, Xiang Ren

    Abstract: Hate speech classifiers trained on imbalanced datasets struggle to determine if group identifiers like "gay" or "black" are used in offensive or prejudiced ways. Such biases manifest in false positives when these identifiers are present, due to models' inability to learn the contexts which constitute a hateful usage of identifiers. We extract SOC post-hoc explanations from fine-tuned BERT classifi…

    Submitted 6 July, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

    Comments: To appear in Proceedings of the 2020 Annual Conference of the Association for Computational Linguistics; Updated references and discussions

  14. arXiv:1909.02126

    cs.CL cs.CY

    Reporting the Unreported: Event Extraction for Analyzing the Local Representation of Hate Crimes

    Authors: Aida Mostafazadeh Davani, Leigh Yeh, Mohammad Atari, Brendan Kennedy, Gwenyth Portillo-Wightman, Elaine Gonzalez, Natalie Delong, Rhea Bhatia, Arineh Mirinjian, Xiang Ren, Morteza Dehghani

    Abstract: Official reports of hate crimes in the US are under-reported relative to the actual number of such incidents. Further, despite statistical approximations, there are no official reports from a large number of US cities regarding incidents of hate. Here, we first demonstrate that event extraction and multi-instance learning, applied to a corpus of local news articles, can be used to predict instance…

    Submitted 4 September, 2019; originally announced September 2019.