On the Interaction of Belief Bias and Explanations

Gonzalez, Ana Valeria; Rogers, Anna; Søgaard, Anders

arXiv:2106.15355 (cs)

[Submitted on 29 Jun 2021]

Abstract: A myriad of explainability methods have been proposed in recent years, but there is little consensus on how to evaluate them. While automatic metrics allow for quick benchmarking, it isn't clear how such metrics reflect human interaction with explanations. Human evaluation is of paramount importance, but previous protocols fail to account for belief biases affecting human performance, which may lead to misleading conclusions. We provide an overview of belief bias, its role in human evaluation, and ideas for NLP practitioners on how to account for it. For two experimental paradigms, we present a case study of gradient-based explainability introducing simple ways to account for humans' prior beliefs: models of varying quality and adversarial examples. We show that conclusions about the highest performing methods change when introducing such controls, pointing to the importance of accounting for belief bias in evaluation.

Comments:	Accepted at Findings of ACL 2021
Subjects:	Computation and Language (cs.CL); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2106.15355 [cs.CL]
	(or arXiv:2106.15355v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2106.15355

Submission history

From: Ana Valeria Gonzalez
[v1] Tue, 29 Jun 2021 12:49:42 UTC (8,704 KB)
