Gradient-free Post-hoc Explainability Using Distillation Aided Learnable Approach

Bhattacharya, Debarpan; Poorjam, Amir H.; Mittal, Deepak; Ganapathy, Sriram

arXiv:2409.11123 (cs)

[Submitted on 17 Sep 2024]

Abstract: Recent advances in artificial intelligence (AI), including the release of several large models that offer only query access, make a strong case for explaining deep models in a post-hoc, gradient-free manner. In this paper, we propose a framework, named distillation aided explainability (DAX), that attempts to generate a saliency-based explanation in a model-agnostic, gradient-free setting. DAX poses the explanation problem in a learnable setting with a mask generation network and a distillation network. The mask generation network learns to generate a multiplier mask that identifies the salient regions of the input, while the student distillation network aims to approximate the local behavior of the black-box model. We propose a joint optimization of the two networks using locally perturbed input samples, with targets derived from input-output access to the black-box model. We extensively evaluate DAX across different modalities (image and audio) in a classification setting, using a diverse set of evaluations (intersection over union with ground truth, deletion-based measures, and subjective human evaluation), and benchmark it against 9 different methods. In these evaluations, DAX significantly outperforms the existing approaches on all modalities and evaluation metrics.
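To make the joint optimization described in the abstract more concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a tabular input instead of an image or audio signal, replaces the mask generation network with per-feature logits passed through a sigmoid, and replaces the student network with a linear model. The `black_box` function, the squared-error loss, the sparsity penalty on the mask, the weight decay on the student, and all hyperparameters are illustrative assumptions; the abstract does not specify any of them. The only access to the black box is through input-output queries on locally perturbed samples.

```python
import numpy as np


def black_box(X):
    """Hypothetical query-only model: only features 0 and 1 affect the output."""
    return np.tanh(2.0 * X[..., 0] - 1.5 * X[..., 1])


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def explain(x, query, n_samples=256, noise=0.5, steps=4000, lr=0.05,
            sparsity=0.08, weight_decay=0.01, seed=0):
    """Jointly fit a multiplier mask and a linear student around one input x."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    # Locally perturbed copies of x; targets come from input-output queries only.
    X = x + noise * rng.standard_normal((n_samples, d))
    y = query(X)

    theta = np.zeros(d)  # mask logits (stand-in for the mask generation network)
    w = np.zeros(d)      # student weights (stand-in for the distillation network)
    b = 0.0              # student bias
    for _ in range(steps):
        m = sigmoid(theta)
        Z = X * m              # masked perturbed inputs fed to the student
        err = Z @ w + b - y    # student output minus black-box target
        # Assumed loss: mean(err^2) + sparsity * mean(m) + weight_decay * sum(w^2)
        grad_w = 2.0 * Z.T @ err / n_samples + 2.0 * weight_decay * w
        grad_b = 2.0 * err.mean()
        grad_m = 2.0 * (X * w).T @ err / n_samples + sparsity / d
        theta -= lr * grad_m * m * (1.0 - m)  # chain rule through the sigmoid
        w -= lr * grad_w
        b -= lr * grad_b
    return sigmoid(theta), (w, b)


x = np.array([0.6, 0.8, -0.5, 1.0])
mask, (w, b) = explain(x, black_box)
print("saliency mask:", np.round(mask, 2))
```

In this toy setting the mask is the explanation: features the student needs in order to mimic the black box locally keep a large multiplier, while the sparsity penalty shrinks the rest. The weight decay on the student is an assumption added here so that the student cannot simply rescale its weights to compensate for a shrinking mask.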

Comments:	12 pages, 10 figures, Accepted in IEEE Journal of Selected Topics in Signal Processing (JSTSP), 2024
Subjects:	Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
Cite as:	arXiv:2409.11123 [cs.AI]
	(or arXiv:2409.11123v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2409.11123

Submission history

From: Debarpan Bhattacharya
[v1] Tue, 17 Sep 2024 12:21:11 UTC (36,605 KB)
