-
CulturalFrames: Assessing Cultural Expectation Alignment in Text-to-Image Models and Evaluation Metrics
Authors:
Shravan Nayak,
Mehar Bhatia,
Xiaofeng Zhang,
Verena Rieser,
Lisa Anne Hendricks,
Sjoerd van Steenkiste,
Yash Goyal,
Karolina Stańczak,
Aishwarya Agrawal
Abstract:
The increasing ubiquity of text-to-image (T2I) models as tools for visual content generation raises concerns about their ability to accurately represent diverse cultural contexts. In this work, we present the first study to systematically quantify the alignment of T2I models and evaluation metrics with respect to both explicit as well as implicit cultural expectations. To this end, we introduce Cu…
▽ More
The increasing ubiquity of text-to-image (T2I) models as tools for visual content generation raises concerns about their ability to accurately represent diverse cultural contexts. In this work, we present the first study to systematically quantify the alignment of T2I models and evaluation metrics with respect to both explicit as well as implicit cultural expectations. To this end, we introduce CulturalFrames, a novel benchmark designed for rigorous human evaluation of cultural representation in visual generations. Spanning 10 countries and 5 socio-cultural domains, CulturalFrames comprises 983 prompts, 3637 corresponding images generated by 4 state-of-the-art T2I models, and over 10k detailed human annotations. We find that T2I models not only fail to meet the more challenging implicit expectations but also the less challenging explicit expectations. Across models and countries, cultural expectations are missed an average of 44% of the time. Among these failures, explicit expectations are missed at a surprisingly high average rate of 68%, while implicit expectation failures are also significant, averaging 49%. Furthermore, we demonstrate that existing T2I evaluation metrics correlate poorly with human judgments of cultural alignment, irrespective of their internal reasoning. Collectively, our findings expose critical gaps, providing actionable directions for developing more culturally informed T2I models and evaluation methodologies.
△ Less
Submitted 10 June, 2025;
originally announced June 2025.
-
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
Authors:
Xing Han Lù,
Amirhossein Kazemnejad,
Nicholas Meade,
Arkil Patel,
Dongchan Shin,
Alejandra Zambrano,
Karolina Stańczak,
Peter Shaw,
Christopher J. Pal,
Siva Reddy
Abstract:
Web agents enable users to perform tasks on web browsers through natural language interaction. Evaluating web agents trajectories is an important problem, since it helps us determine whether the agent successfully completed the tasks. Rule-based methods are widely used for this purpose, but they are challenging to extend to new tasks and may not always recognize successful trajectories. We may ach…
▽ More
Web agents enable users to perform tasks on web browsers through natural language interaction. Evaluating web agents trajectories is an important problem, since it helps us determine whether the agent successfully completed the tasks. Rule-based methods are widely used for this purpose, but they are challenging to extend to new tasks and may not always recognize successful trajectories. We may achieve higher accuracy through human evaluation, but the process would be substantially slower and more expensive. Automatic evaluations with LLMs may avoid the challenges of designing new rules and manually annotating trajectories, enabling faster and cost-effective evaluation. However, it is unclear how effective they are at evaluating web agents. To this end, we propose AgentRewardBench, the first benchmark to assess the effectiveness of LLM judges for evaluating web agents. AgentRewardBench contains 1302 trajectories across 5 benchmarks and 4 LLMs. Each trajectory in AgentRewardBench is reviewed by an expert, who answers questions pertaining to the success, side effects, and repetitiveness of the agent. Using our benchmark, we evaluate 12 LLM judges and find that no single LLM excels across all benchmarks. We also find that the rule-based evaluation used by common benchmarks tends to underreport the success rate of web agents, highlighting a key weakness of rule-based evaluation and the need to develop more flexible automatic evaluations. We release the benchmark at: https://agent-reward-bench.github.io
△ Less
Submitted 11 April, 2025;
originally announced April 2025.
-
DeepSeek-R1 Thoughtology: Let's think about LLM Reasoning
Authors:
Sara Vera Marjanović,
Arkil Patel,
Vaibhav Adlakha,
Milad Aghajohari,
Parishad BehnamGhader,
Mehar Bhatia,
Aditi Khandelwal,
Austin Kraft,
Benno Krojer,
Xing Han Lù,
Nicholas Meade,
Dongchan Shin,
Amirhossein Kazemnejad,
Gaurav Kamath,
Marius Mosbach,
Karolina Stańczak,
Siva Reddy
Abstract:
Large Reasoning Models like DeepSeek-R1 mark a fundamental shift in how LLMs approach complex problems. Instead of directly producing an answer for a given input, DeepSeek-R1 creates detailed multi-step reasoning chains, seemingly "thinking" about a problem before providing an answer. This reasoning process is publicly available to the user, creating endless opportunities for studying the reasonin…
▽ More
Large Reasoning Models like DeepSeek-R1 mark a fundamental shift in how LLMs approach complex problems. Instead of directly producing an answer for a given input, DeepSeek-R1 creates detailed multi-step reasoning chains, seemingly "thinking" about a problem before providing an answer. This reasoning process is publicly available to the user, creating endless opportunities for studying the reasoning behaviour of the model and opening up the field of Thoughtology. Starting from a taxonomy of DeepSeek-R1's basic building blocks of reasoning, our analyses on DeepSeek-R1 investigate the impact and controllability of thought length, management of long or confusing contexts, cultural and safety concerns, and the status of DeepSeek-R1 vis-à-vis cognitive phenomena, such as human-like language processing and world modelling. Our findings paint a nuanced picture. Notably, we show DeepSeek-R1 has a 'sweet spot' of reasoning, where extra inference time can impair model performance. Furthermore, we find a tendency for DeepSeek-R1 to persistently ruminate on previously explored problem formulations, obstructing further exploration. We also note strong safety vulnerabilities of DeepSeek-R1 compared to its non-reasoning counterpart, which can also compromise safety-aligned LLMs.
△ Less
Submitted 12 May, 2025; v1 submitted 1 April, 2025;
originally announced April 2025.
-
SafeArena: Evaluating the Safety of Autonomous Web Agents
Authors:
Ada Defne Tur,
Nicholas Meade,
Xing Han Lù,
Alejandra Zambrano,
Arkil Patel,
Esin Durmus,
Spandana Gella,
Karolina Stańczak,
Siva Reddy
Abstract:
LLM-based agents are becoming increasingly proficient at solving web-based tasks. With this capability comes a greater risk of misuse for malicious purposes, such as posting misinformation in an online forum or selling illicit substances on a website. To evaluate these risks, we propose SafeArena, the first benchmark to focus on the deliberate misuse of web agents. SafeArena comprises 250 safe and…
▽ More
LLM-based agents are becoming increasingly proficient at solving web-based tasks. With this capability comes a greater risk of misuse for malicious purposes, such as posting misinformation in an online forum or selling illicit substances on a website. To evaluate these risks, we propose SafeArena, the first benchmark to focus on the deliberate misuse of web agents. SafeArena comprises 250 safe and 250 harmful tasks across four websites. We classify the harmful tasks into five harm categories -- misinformation, illegal activity, harassment, cybercrime, and social bias, designed to assess realistic misuses of web agents. We evaluate leading LLM-based web agents, including GPT-4o, Claude-3.5 Sonnet, Qwen-2-VL 72B, and Llama-3.2 90B, on our benchmark. To systematically assess their susceptibility to harmful tasks, we introduce the Agent Risk Assessment framework that categorizes agent behavior across four risk levels. We find agents are surprisingly compliant with malicious requests, with GPT-4o and Qwen-2 completing 34.7% and 27.3% of harmful requests, respectively. Our findings highlight the urgent need for safety alignment procedures for web agents. Our benchmark is available here: https://safearena.github.io
△ Less
Submitted 6 March, 2025;
originally announced March 2025.
-
Societal Alignment Frameworks Can Improve LLM Alignment
Authors:
Karolina Stańczak,
Nicholas Meade,
Mehar Bhatia,
Hattie Zhou,
Konstantin Böttinger,
Jeremy Barnes,
Jason Stanley,
Jessica Montgomery,
Richard Zemel,
Nicolas Papernot,
Nicolas Chapados,
Denis Therien,
Timothy P. Lillicrap,
Ana Marasović,
Sylvie Delacroix,
Gillian K. Hadfield,
Siva Reddy
Abstract:
Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared values - a process coined alignment. However, aligning LLMs remains challenging due to the inherent disconnect between the complexity of human values and the narrow nature of the technological approaches designed to address them. Current alignment methods often lead…
▽ More
Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared values - a process coined alignment. However, aligning LLMs remains challenging due to the inherent disconnect between the complexity of human values and the narrow nature of the technological approaches designed to address them. Current alignment methods often lead to misspecified objectives, reflecting the broader issue of incomplete contracts, the impracticality of specifying a contract between a model developer, and the model that accounts for every scenario in LLM alignment. In this paper, we argue that improving LLM alignment requires incorporating insights from societal alignment frameworks, including social, economic, and contractual alignment, and discuss potential solutions drawn from these domains. Given the role of uncertainty within societal alignment frameworks, we then investigate how it manifests in LLM alignment. We end our discussion by offering an alternative view on LLM alignment, framing the underspecified nature of its objectives as an opportunity rather than perfect their specification. Beyond technical improvements in LLM alignment, we discuss the need for participatory alignment interface designs.
△ Less
Submitted 27 February, 2025;
originally announced March 2025.
-
Benchmarking Vision Language Models for Cultural Understanding
Authors:
Shravan Nayak,
Kanishk Jain,
Rabiul Awal,
Siva Reddy,
Sjoerd van Steenkiste,
Lisa Anne Hendricks,
Karolina Stańczak,
Aishwarya Agrawal
Abstract:
Foundation models and vision-language pre-training have notably advanced Vision Language Models (VLMs), enabling multimodal processing of visual and linguistic data. However, their performance has been typically assessed on general scene understanding - recognizing objects, attributes, and actions - rather than cultural comprehension. This study introduces CulturalVQA, a visual question-answering…
▽ More
Foundation models and vision-language pre-training have notably advanced Vision Language Models (VLMs), enabling multimodal processing of visual and linguistic data. However, their performance has been typically assessed on general scene understanding - recognizing objects, attributes, and actions - rather than cultural comprehension. This study introduces CulturalVQA, a visual question-answering benchmark aimed at assessing VLM's geo-diverse cultural understanding. We curate a collection of 2,378 image-question pairs with 1-5 answers per question representing cultures from 11 countries across 5 continents. The questions probe understanding of various facets of culture such as clothing, food, drinks, rituals, and traditions. Benchmarking VLMs on CulturalVQA, including GPT-4V and Gemini, reveals disparity in their level of cultural understanding across regions, with strong cultural understanding capabilities for North America while significantly lower performance for Africa. We observe disparity in their performance across cultural facets too, with clothing, rituals, and traditions seeing higher performances than food and drink. These disparities help us identify areas where VLMs lack cultural understanding and demonstrate the potential of CulturalVQA as a comprehensive evaluation set for gauging VLM progress in understanding diverse cultures.
△ Less
Submitted 14 October, 2024; v1 submitted 15 July, 2024;
originally announced July 2024.
-
A Multilingual Perspective on Probing Gender Bias
Authors:
Karolina Stańczak
Abstract:
Gender bias represents a form of systematic negative treatment that targets individuals based on their gender. This discrimination can range from subtle sexist remarks and gendered stereotypes to outright hate speech. Prior research has revealed that ignoring online abuse not only affects the individuals targeted but also has broader societal implications. These consequences extend to the discoura…
▽ More
Gender bias represents a form of systematic negative treatment that targets individuals based on their gender. This discrimination can range from subtle sexist remarks and gendered stereotypes to outright hate speech. Prior research has revealed that ignoring online abuse not only affects the individuals targeted but also has broader societal implications. These consequences extend to the discouragement of women's engagement and visibility within public spheres, thereby reinforcing gender inequality. This thesis investigates the nuances of how gender bias is expressed through language and within language technologies. Significantly, this thesis expands research on gender bias to multilingual contexts, emphasising the importance of a multilingual and multicultural perspective in understanding societal biases. In this thesis, I adopt an interdisciplinary approach, bridging natural language processing with other disciplines such as political science and history, to probe gender bias in natural language and language models.
△ Less
Submitted 15 March, 2024;
originally announced March 2024.
-
The Causal Influence of Grammatical Gender on Distributional Semantics
Authors:
Karolina Stańczak,
Kevin Du,
Adina Williams,
Isabelle Augenstein,
Ryan Cotterell
Abstract:
How much meaning influences gender assignment across languages is an active area of research in linguistics and cognitive science. We can view current approaches as aiming to determine where gender assignment falls on a spectrum, from being fully arbitrarily determined to being largely semantically determined. For the latter case, there is a formulation of the neo-Whorfian hypothesis, which claims…
▽ More
How much meaning influences gender assignment across languages is an active area of research in linguistics and cognitive science. We can view current approaches as aiming to determine where gender assignment falls on a spectrum, from being fully arbitrarily determined to being largely semantically determined. For the latter case, there is a formulation of the neo-Whorfian hypothesis, which claims that even inanimate noun gender influences how people conceive of and talk about objects (using the choice of adjective used to modify inanimate nouns as a proxy for meaning). We offer a novel, causal graphical model that jointly represents the interactions between a noun's grammatical gender, its meaning, and adjective choice. In accordance with past results, we find a significant relationship between the gender of nouns and the adjectives that modify them. However, when we control for the meaning of the noun, the relationship between grammatical gender and adjective choice is near zero and insignificant.
△ Less
Submitted 22 October, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
Invisible Women in Digital Diplomacy: A Multidimensional Framework for Online Gender Bias Against Women Ambassadors Worldwide
Authors:
Yevgeniy Golovchenko,
Karolina Stańczak,
Rebecca Adler-Nissen,
Patrice Wangen,
Isabelle Augenstein
Abstract:
Despite mounting evidence that women in foreign policy often bear the brunt of online hostility, the extent of online gender bias against diplomats remains unexplored. This paper offers the first global analysis of the treatment of women diplomats on social media. Introducing a multidimensional and multilingual methodology for studying online gender bias, it focuses on three critical elements: gen…
▽ More
Despite mounting evidence that women in foreign policy often bear the brunt of online hostility, the extent of online gender bias against diplomats remains unexplored. This paper offers the first global analysis of the treatment of women diplomats on social media. Introducing a multidimensional and multilingual methodology for studying online gender bias, it focuses on three critical elements: gendered language, negativity in tweets directed at diplomats, and the visibility of women diplomats. Our unique dataset encompasses ambassadors from 164 countries, their tweets, and the direct responses to these tweets in 65 different languages. Using automated content and sentiment analysis, our findings reveal a crucial gender bias. The language in responses to diplomatic tweets is only mildly gendered and largely pertains to international affairs and, generally, women ambassadors do not receive more negative reactions to their tweets than men, yet the pronounced discrepancy in online visibility stands out as a significant form of gender bias. Women receive a staggering 66.4% fewer retweets than men. By unraveling the invisibility that obscures women diplomats on social media, we hope to spark further research on online bias in international politics.
△ Less
Submitted 29 November, 2023;
originally announced November 2023.
-
Social Bias Probing: Fairness Benchmarking for Language Models
Authors:
Marta Marchiori Manerba,
Karolina Stańczak,
Riccardo Guidotti,
Isabelle Augenstein
Abstract:
While the impact of social biases in language models has been recognized, prior methods for bias evaluation have been limited to binary association tests on small datasets, limiting our understanding of bias complexities. This paper proposes a novel framework for probing language models for social biases by assessing disparate treatment, which involves treating individuals differently according to…
▽ More
While the impact of social biases in language models has been recognized, prior methods for bias evaluation have been limited to binary association tests on small datasets, limiting our understanding of bias complexities. This paper proposes a novel framework for probing language models for social biases by assessing disparate treatment, which involves treating individuals differently according to their affiliation with a sensitive demographic group. We curate SoFa, a large-scale benchmark designed to address the limitations of existing fairness collections. SoFa expands the analysis beyond the binary comparison of stereotypical versus anti-stereotypical identities to include a diverse range of identities and stereotypes. Comparing our methodology with existing benchmarks, we reveal that biases within language models are more nuanced than acknowledged, indicating a broader scope of encoded biases than previously recognized. Benchmarking LMs on SoFa, we expose how identities expressing different religions lead to the most pronounced disparate treatments across all models. Finally, our findings indicate that real-life adversities faced by various groups such as women and people with disabilities are mirrored in the behavior of these models.
△ Less
Submitted 7 October, 2024; v1 submitted 15 November, 2023;
originally announced November 2023.
-
Measuring Intersectional Biases in Historical Documents
Authors:
Nadav Borenstein,
Karolina Stańczak,
Thea Rolskov,
Natália da Silva Perez,
Natacha Klein Käfer,
Isabelle Augenstein
Abstract:
Data-driven analyses of biases in historical texts can help illuminate the origin and development of biases prevailing in modern society.
However, digitised historical documents pose a challenge for NLP practitioners as these corpora suffer from errors introduced by optical character recognition (OCR) and are written in an archaic language. In this paper, we investigate the continuities and tran…
▽ More
Data-driven analyses of biases in historical texts can help illuminate the origin and development of biases prevailing in modern society.
However, digitised historical documents pose a challenge for NLP practitioners as these corpora suffer from errors introduced by optical character recognition (OCR) and are written in an archaic language. In this paper, we investigate the continuities and transformations of bias in historical newspapers published in the Caribbean during the colonial era (18th to 19th centuries). Our analyses are performed along the axes of gender, race, and their intersection. We examine these biases by conducting a temporal study in which we measure the development of lexical associations using distributional semantics models and word embeddings. Further, we evaluate the effectiveness of techniques designed to process OCR-generated data and assess their stability when trained on and applied to the noisy historical newspapers. We find that there is a trade-off between the stability of the word embeddings and their compatibility with the historical dataset. We provide evidence that gender and racial biases are interdependent, and their intersection triggers distinct effects. These findings align with the theory of intersectionality, which stresses that biases affecting people with multiple marginalised identities compound to more than the sum of their constituents.
△ Less
Submitted 21 May, 2023;
originally announced May 2023.
-
Measuring Gender Bias in West Slavic Language Models
Authors:
Sandra Martinková,
Karolina Stańczak,
Isabelle Augenstein
Abstract:
Pre-trained language models have been known to perpetuate biases from the underlying datasets to downstream tasks. However, these findings are predominantly based on monolingual language models for English, whereas there are few investigative studies of biases encoded in language models for languages beyond English. In this paper, we fill this gap by analysing gender bias in West Slavic language m…
▽ More
Pre-trained language models have been known to perpetuate biases from the underlying datasets to downstream tasks. However, these findings are predominantly based on monolingual language models for English, whereas there are few investigative studies of biases encoded in language models for languages beyond English. In this paper, we fill this gap by analysing gender bias in West Slavic language models. We introduce the first template-based dataset in Czech, Polish, and Slovak for measuring gender bias towards male, female and non-binary subjects. We complete the sentences using both mono- and multilingual language models and assess their suitability for the masked language modelling objective. Next, we measure gender bias encoded in West Slavic language models by quantifying the toxicity and genderness of the generated words. We find that these language models produce hurtful completions that depend on the subject's gender. Perhaps surprisingly, Czech, Slovak, and Polish language models produce more hurtful completions with men as subjects, which, upon inspection, we find is due to completions being related to violence, death, and sickness.
△ Less
Submitted 25 May, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models
Authors:
Karolina Stańczak,
Edoardo Ponti,
Lucas Torroba Hennigen,
Ryan Cotterell,
Isabelle Augenstein
Abstract:
The success of multilingual pre-trained models is underpinned by their ability to learn representations shared by multiple languages even in absence of any explicit supervision. However, it remains unclear how these models learn to generalise across languages. In this work, we conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar. In particular, w…
▽ More
The success of multilingual pre-trained models is underpinned by their ability to learn representations shared by multiple languages even in absence of any explicit supervision. However, it remains unclear how these models learn to generalise across languages. In this work, we conjecture that multilingual pre-trained models can derive language-universal abstractions about grammar. In particular, we investigate whether morphosyntactic information is encoded in the same subset of neurons in different languages. We conduct the first large-scale empirical study over 43 languages and 14 morphosyntactic categories with a state-of-the-art neuron-level probe. Our findings show that the cross-lingual overlap between neurons is significant, but its extent may vary across categories and depends on language proximity and pre-training data size.
△ Less
Submitted 8 May, 2022; v1 submitted 4 May, 2022;
originally announced May 2022.
-
A Latent-Variable Model for Intrinsic Probing
Authors:
Karolina Stańczak,
Lucas Torroba Hennigen,
Adina Williams,
Ryan Cotterell,
Isabelle Augenstein
Abstract:
The success of pre-trained contextualized representations has prompted researchers to analyze them for the presence of linguistic information. Indeed, it is natural to assume that these pre-trained representations do encode some level of linguistic knowledge as they have brought about large empirical improvements on a wide variety of NLP tasks, which suggests they are learning true linguistic gene…
▽ More
The success of pre-trained contextualized representations has prompted researchers to analyze them for the presence of linguistic information. Indeed, it is natural to assume that these pre-trained representations do encode some level of linguistic knowledge as they have brought about large empirical improvements on a wide variety of NLP tasks, which suggests they are learning true linguistic generalization. In this work, we focus on intrinsic probing, an analysis technique where the goal is not only to identify whether a representation encodes a linguistic attribute but also to pinpoint where this attribute is encoded. We propose a novel latent-variable formulation for constructing intrinsic probes and derive a tractable variational approximation to the log-likelihood. Our results show that our model is versatile and yields tighter mutual information estimates than two intrinsic probes previously proposed in the literature. Finally, we find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
△ Less
Submitted 11 July, 2024; v1 submitted 20 January, 2022;
originally announced January 2022.
-
A Survey on Gender Bias in Natural Language Processing
Authors:
Karolina Stanczak,
Isabelle Augenstein
Abstract:
Language can be used as a means of reproducing and enforcing harmful stereotypes and biases and has been analysed as such in numerous research. In this paper, we present a survey of 304 papers on gender bias in natural language processing. We analyse definitions of gender and its categories within social sciences and connect them to formal definitions of gender bias in NLP research. We survey lexi…
▽ More
Language can be used as a means of reproducing and enforcing harmful stereotypes and biases and has been analysed as such in numerous research. In this paper, we present a survey of 304 papers on gender bias in natural language processing. We analyse definitions of gender and its categories within social sciences and connect them to formal definitions of gender bias in NLP research. We survey lexica and datasets applied in research on gender bias and then compare and contrast approaches to detecting and mitigating gender bias. We find that research on gender bias suffers from four core limitations. 1) Most research treats gender as a binary variable neglecting its fluidity and continuity. 2) Most of the work has been conducted in monolingual setups for English or other high-resource languages. 3) Despite a myriad of papers on gender bias in NLP methods, we find that most of the newly developed algorithms do not test their models for bias and disregard possible ethical considerations of their work. 4) Finally, methodologies developed in this line of research are fundamentally flawed covering very limited definitions of gender bias and lacking evaluation baselines and pipelines. We suggest recommendations towards overcoming these limitations as a guide for future research.
△ Less
Submitted 28 December, 2021;
originally announced December 2021.
-
Quantifying Gender Biases Towards Politicians on Reddit
Authors:
Sara Marjanovic,
Karolina Stańczak,
Isabelle Augenstein
Abstract:
Despite attempts to increase gender parity in politics, global efforts have struggled to ensure equal female representation. This is likely tied to implicit gender biases against women in authority. In this work, we present a comprehensive study of gender biases that appear in online political discussion. To this end, we collect 10 million comments on Reddit in conversations about male and female…
▽ More
Despite attempts to increase gender parity in politics, global efforts have struggled to ensure equal female representation. This is likely tied to implicit gender biases against women in authority. In this work, we present a comprehensive study of gender biases that appear in online political discussion. To this end, we collect 10 million comments on Reddit in conversations about male and female politicians, which enables an exhaustive study of automatic gender bias detection. We address not only misogynistic language, but also other manifestations of bias, like benevolent sexism in the form of seemingly positive sentiment and dominance attributed to female politicians, or differences in descriptor attribution. Finally, we conduct a multi-faceted study of gender bias towards politicians investigating both linguistic and extra-linguistic cues. We assess 5 different types of gender bias, evaluating coverage, combinatorial, nominal, sentimental, and lexical biases extant in social media language and discourse. Overall, we find that, contrary to previous research, coverage and sentiment biases suggest equal public interest in female politicians. Rather than overt hostile or benevolent sexism, the results of the nominal and lexical analyses suggest this interest is not as professional or respectful as that expressed about male politicians. Female politicians are often named by their first names and are described in relation to their body, clothing, or family; this is a treatment that is not similarly extended to men. On the now banned far-right subreddits, this disparity is greatest, though differences in gender biases still appear in the right and left-leaning subreddits. We release the curated dataset to the public for future studies.
△ Less
Submitted 7 September, 2022; v1 submitted 22 December, 2021;
originally announced December 2021.
-
Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models
Authors:
Karolina Stańczak,
Sagnik Ray Choudhury,
Tiago Pimentel,
Ryan Cotterell,
Isabelle Augenstein
Abstract:
Recent research has demonstrated that large pre-trained language models reflect societal biases expressed in natural language. The present paper introduces a simple method for probing language models to conduct a multilingual study of gender bias towards politicians. We quantify the usage of adjectives and verbs generated by language models surrounding the names of politicians as a function of the…
▽ More
Recent research has demonstrated that large pre-trained language models reflect societal biases expressed in natural language. The present paper introduces a simple method for probing language models to conduct a multilingual study of gender bias towards politicians. We quantify the usage of adjectives and verbs generated by language models surrounding the names of politicians as a function of their gender. To this end, we curate a dataset of 250k politicians worldwide, including their names and gender. Our study is conducted in seven languages across six different language modeling architectures. The results demonstrate that pre-trained language models' stance towards politicians varies strongly across analyzed languages. We find that while some words such as dead, and designated are associated with both male and female politicians, a few specific words such as beautiful and divorced are predominantly associated with female politicians. Finally, and contrary to previous findings, our study suggests that larger language models do not tend to be significantly more gender-biased than smaller ones.
△ Less
Submitted 9 November, 2023; v1 submitted 15 April, 2021;
originally announced April 2021.