Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models

Stańczak, Karolina; Choudhury, Sagnik Ray; Pimentel, Tiago; Cotterell, Ryan; Augenstein, Isabelle

arXiv:2104.07505 (cs)

[Submitted on 15 Apr 2021 (v1), last revised 9 Nov 2023 (this version, v2)]

Abstract: Recent research has demonstrated that large pre-trained language models reflect societal biases expressed in natural language. The present paper introduces a simple method for probing language models to conduct a multilingual study of gender bias towards politicians. We quantify the usage of adjectives and verbs generated by language models surrounding the names of politicians as a function of their gender. To this end, we curate a dataset of 250k politicians worldwide, including their names and gender. Our study is conducted in seven languages across six different language modeling architectures. The results demonstrate that pre-trained language models' stance towards politicians varies strongly across analyzed languages. We find that while some words such as dead and designated are associated with both male and female politicians, a few specific words such as beautiful and divorced are predominantly associated with female politicians. Finally, and contrary to previous findings, our study suggests that larger language models do not tend to be significantly more gender-biased than smaller ones.
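
To make the idea of measuring word usage "as a function of gender" concrete, the following is a minimal sketch that scores word-gender association with pointwise mutual information (PMI) over (word, gender) pairs. PMI is a stand-in chosen for illustration and is not necessarily the measure used in the paper; the function name `gender_pmi`, the placeholder words, and the toy counts are all hypothetical and are not data from the study.

```python
import math
from collections import Counter


def gender_pmi(pairs):
    """Return {(word, gender): PMI} for a list of observed (word, gender) pairs.

    PMI(w, g) = log( p(w, g) / (p(w) * p(g)) ), estimated from raw counts.
    This is an illustrative association score, assumed for this sketch.
    """
    n = len(pairs)
    joint = Counter(pairs)                       # counts of (word, gender)
    word_totals = Counter(w for w, _ in pairs)   # counts of each word
    gender_totals = Counter(g for _, g in pairs) # counts of each gender
    return {
        (w, g): math.log((c / n) / ((word_totals[w] / n) * (gender_totals[g] / n)))
        for (w, g), c in joint.items()
    }


# Hypothetical toy input: each pair stands for one word generated next to a
# politician's name, tagged with that politician's gender. Not real data.
toy_pairs = (
    [("word_a", "female")] * 3 + [("word_a", "male")] * 1
    + [("word_b", "female")] * 1 + [("word_b", "male")] * 3
)

scores = gender_pmi(toy_pairs)
for key in sorted(scores):
    print(key, round(scores[key], 3))
```

A positive score means the word co-occurs with that gender more often than independence would predict, and a negative score means less often. In a real setting the pairs would come from words a language model generates around politicians' names.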

Subjects:	Computation and Language (cs.CL); Machine Learning (stat.ML)
Cite as:	arXiv:2104.07505 [cs.CL]
	(or arXiv:2104.07505v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2104.07505

Submission history

From: Karolina Stańczak
[v1] Thu, 15 Apr 2021 15:03:26 UTC (7,465 KB)
[v2] Thu, 9 Nov 2023 16:15:40 UTC (7,788 KB)
