Dynamic Top-k Estimation Consolidates Disagreement between Feature Attribution Methods

Kamp, Jonathan; Beinborn, Lisa; Fokkens, Antske

arXiv:2310.05619 (cs)

[Submitted on 9 Oct 2023 (v1), last revised 3 Nov 2023 (this version, v2)]

Abstract: Feature attribution scores are used to explain the prediction of a text classifier to users by highlighting k tokens. In this work, we propose a way to determine the optimal number k of tokens to display from sequential properties of the attribution scores. Our approach is dynamic across sentences, method-agnostic, and deals with sentence length bias. We compare agreement between multiple methods and humans on an NLI task, using fixed k and dynamic k. We find that perturbation-based methods and Vanilla Gradient exhibit the highest agreement on most method-method and method-human agreement metrics with a static k. Their advantage over other methods disappears with dynamic ks, which mainly improve Integrated Gradient and GradientXInput. To our knowledge, this is the first evidence that sequential properties of attribution scores are informative for consolidating attribution signals for human interpretation.
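As a rough illustration of the contrast between a fixed k and a per-sentence dynamic k, the sketch below selects top-k tokens from two attribution score sequences and measures how much the selections overlap. The abstract does not specify which sequential properties are used, so counting local maxima in the score sequence is an assumption made here for illustration, as is Jaccard overlap as the agreement measure. The function names (`top_k_indices`, `local_maxima`, `dynamic_k`, `overlap_agreement`) and the toy scores are hypothetical and not taken from the paper.

```python
from typing import List, Set


def top_k_indices(scores: List[float], k: int) -> Set[int]:
    """Indices of the k highest-scoring tokens (ties broken by position)."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return set(order[:k])


def local_maxima(scores: List[float]) -> List[int]:
    """Positions whose score is strictly greater than both neighbours.

    Sequence boundaries are compared against negative infinity.
    """
    padded = [float("-inf")] + list(scores) + [float("-inf")]
    return [
        i - 1
        for i in range(1, len(padded) - 1)
        if padded[i - 1] < padded[i] > padded[i + 1]
    ]


def dynamic_k(scores: List[float]) -> int:
    """Assumed heuristic: k is the number of local maxima, at least 1."""
    return max(1, len(local_maxima(scores)))


def overlap_agreement(a: Set[int], b: Set[int]) -> float:
    """Jaccard overlap between two sets of selected token indices."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Made-up attribution scores from two hypothetical methods for one sentence.
method_a = [0.1, 0.7, 0.2, 0.05, 0.6, 0.3]
method_b = [0.0, 0.5, 0.4, 0.1, 0.9, 0.2]

# Fixed k: the same number of tokens for every sentence and method.
fixed = overlap_agreement(top_k_indices(method_a, 3), top_k_indices(method_b, 3))

# Dynamic k: each method's own score sequence determines how many tokens to show.
dyn = overlap_agreement(
    top_k_indices(method_a, dynamic_k(method_a)),
    top_k_indices(method_b, dynamic_k(method_b)),
)

print(fixed)  # 0.5
print(dyn)    # 1.0
```

In this toy case both sequences have two peaks, so the dynamic selection keeps only the two peak tokens from each method and the selections coincide, whereas a fixed k of 3 forces each method to include a third, differing token. The numbers are constructed for illustration and say nothing about the paper's reported results.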

Comments:	Short paper accepted to EMNLP 2023 main conference. Please cite the EMNLP version when available
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2310.05619 [cs.CL]
	(or arXiv:2310.05619v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2310.05619

Submission history

From: Jonathan Kamp
[v1] Mon, 9 Oct 2023 11:19:33 UTC (8,917 KB)
[v2] Fri, 3 Nov 2023 12:11:17 UTC (8,944 KB)
