Can Instruction Fine-Tuned Language Models Identify Social Bias through Prompting?

Dige, Omkar; Tian, Jacob-Junqi; Emerson, David; Khattak, Faiza Khan

arXiv:2307.10472 (cs)

[Submitted on 19 Jul 2023]

Abstract: As the breadth and depth of language model applications continue to expand rapidly, it is increasingly important to build efficient frameworks for measuring and mitigating the learned or inherited social biases of these models. In this paper, we evaluate the ability of instruction fine-tuned language models to identify bias through zero-shot prompting, including Chain-of-Thought (CoT) prompts. Among LLaMA and its two instruction fine-tuned versions, Alpaca 7B performs best on the bias identification task, with an accuracy of 56.7%. We also demonstrate that scaling up LLM size and data diversity could lead to further performance gains. This is a work in progress presenting the first component of our bias mitigation framework. We will keep updating this work as we get more results.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Machine Learning (cs.LG)
Cite as:	arXiv:2307.10472 [cs.CL]
	(or arXiv:2307.10472v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2307.10472

Submission history

From: Dr. Faiza Khattak
[v1] Wed, 19 Jul 2023 22:03:40 UTC (19,987 KB)
