Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models

Stańczak, Karolina; Choudhury, Sagnik Ray; Pimentel, Tiago; Cotterell, Ryan; Augenstein, Isabelle


arXiv:2104.07505v1 (cs)

[Submitted on 15 Apr 2021 (this version), latest version 9 Nov 2023 (v2)]


Abstract: While the prevalence of large pre-trained language models has led to significant improvements in the performance of NLP systems, recent research has demonstrated that these models inherit societal biases extant in natural language. In this paper, we explore a simple method to probe pre-trained language models for gender bias, which we use to effect a multi-lingual study of gender bias towards politicians. We construct a dataset of 250k politicians from most countries in the world and quantify adjective and verb usage around those politicians' names as a function of their gender. We conduct our study in 7 languages across 6 different language modeling architectures. Our results demonstrate that stance towards politicians in pre-trained language models is highly dependent on the language used. Finally, contrary to previous findings, our study suggests that larger language models do not tend to be significantly more gender-biased than smaller ones.

Subjects:	Computation and Language (cs.CL); Machine Learning (stat.ML)
Cite as:	arXiv:2104.07505 [cs.CL]
	(or arXiv:2104.07505v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2104.07505

Submission history

From: Karolina Stanczak
[v1] Thu, 15 Apr 2021 15:03:26 UTC (7,465 KB)
[v2] Thu, 9 Nov 2023 16:15:40 UTC (7,788 KB)
