Fool SHAP with Stealthily Biased Sampling

Laberge, Gabriel; Aïvodji, Ulrich; Hara, Satoshi; Marchand, Mario; Khomh, Foutse

arXiv:2205.15419 (cs)

[Submitted on 30 May 2022 (v1), last revised 3 Mar 2023 (this version, v3)]

Abstract: SHAP explanations aim to identify which features contribute the most to the difference between the model prediction at a specific input and the prediction over a background distribution. Recent studies have shown that malicious adversaries can manipulate them to produce arbitrary desired explanations. However, existing attacks focus solely on altering the black-box model itself. In this paper, we propose a complementary family of attacks that leave the model intact and manipulate SHAP explanations using stealthily biased sampling of the data points used to approximate expectations with respect to the background distribution. In the context of a fairness audit, we show that our attack can reduce the importance of a sensitive feature when explaining the difference in outcomes between groups while remaining undetected. More precisely, experiments performed on real-world datasets showed that our attack could yield up to a 90% relative decrease in the amplitude of the sensitive feature attribution. These results highlight the manipulability of SHAP explanations and encourage auditors to treat them with skepticism.
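
The core idea of the abstract, that an attribution depends on which background points are used, can be illustrated with a small sketch. This is not the authors' algorithm: the model, the data, and the helper names (`model`, `shap_sensitive`, `group_attribution`, `tilted_weights`) are all hypothetical, and the exponential reweighting is a crude stand-in for the paper's biased sampling. In particular, the sketch makes no attempt to keep the biased sample close to the true background, which is what the abstract means by remaining undetected.

```python
import math

def model(s, x1):
    # Hypothetical model: the sensitive feature s changes the slope on x1.
    return 1.0 * x1 + 2.0 * s * x1

def shap_sensitive(x, z):
    """Exact two-feature Shapley value of s for input x against one reference z."""
    (xs, x1), (zs, z1) = x, z
    # Average the marginal effect of switching s over both feature orderings.
    before_x1 = model(xs, z1) - model(zs, z1)  # s switched while x1 is still at z1
    after_x1 = model(xs, x1) - model(zs, x1)   # s switched after x1 is already at x1
    return 0.5 * before_x1 + 0.5 * after_x1

def group_attribution(foreground, background, bg_weights):
    """Mean Shapley value of s over foreground inputs and weighted background points."""
    total = 0.0
    for x in foreground:
        for z, w in zip(background, bg_weights):
            total += w * shap_sensitive(x, z)
    return total / len(foreground)

def tilted_weights(background, strength):
    """Non-uniform background weights that favour points with small x1 (assumed heuristic)."""
    scores = [math.exp(-strength * z1) for _, z1 in background]
    norm = sum(scores)
    return [score / norm for score in scores]

# Toy data: each point is (s, x1); the two groups differ in the sensitive feature s.
foreground = [(1, 1.0), (1, 2.0), (1, 3.0)]
background = [(0, 0.0), (0, 1.0), (0, 2.0), (0, 3.0)]

uniform = [1.0 / len(background)] * len(background)
honest = group_attribution(foreground, background, uniform)
biased = group_attribution(foreground, background, tilted_weights(background, 1.0))

print("attribution of s, uniform background:", honest)
print("attribution of s, biased background: ", biased)
```

The model is never modified in this sketch; only the weights on the background points change, and the attribution of the sensitive feature shrinks as a result. A realistic attack would also have to bound how far the weights drift from uniform, which this example leaves out.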

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2205.15419 [cs.LG]
	(or arXiv:2205.15419v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2205.15419

Submission history

From: Gabriel Laberge [view email]
[v1] Mon, 30 May 2022 20:33:46 UTC (795 KB)
[v2] Thu, 29 Sep 2022 14:39:33 UTC (3,095 KB)
[v3] Fri, 3 Mar 2023 15:10:57 UTC (18,885 KB)
