Post-breach Recovery: Protection against White-box Adversarial Examples for Leaked DNN Models

Shan, Shawn; Ding, Wenxin; Wenger, Emily; Zheng, Haitao; Zhao, Ben Y.

doi:10.1145/3548606.3560561

arXiv:2205.10686 (cs)

[Submitted on 21 May 2022 (v1), last revised 16 Oct 2022 (this version, v2)]

Abstract: Server breaches are an unfortunate reality on today's Internet. For deep neural network (DNN) models, they are particularly harmful, because a leaked model gives an attacker "white-box" access for generating adversarial examples, a threat model that has no practical robust defenses. For practitioners who have invested years and millions into proprietary DNNs, e.g. for medical imaging, this seems like an inevitable disaster looming on the horizon.
In this paper, we consider the problem of post-breach recovery for DNN models. We propose Neo, a new system that creates new versions of leaked models, alongside an inference-time filter that detects and removes adversarial examples generated on previously leaked models. The classification surfaces of different model versions are slightly offset (by introducing hidden distributions), and Neo detects the overfitting of an attack to the leaked model on which it was generated. We show that across a variety of tasks and attack methods, Neo filters out attacks from leaked models with very high accuracy and provides strong protection (7 to 10 recoveries) against attackers who repeatedly breach the server. Neo performs well against a variety of strong adaptive attacks, with only a slight drop in the number of breaches recoverable, and demonstrates potential as a complement to DNN defenses in the wild.
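
The filtering idea in the abstract can be pictured with a small sketch. The abstract does not give Neo's detection rule, so everything below is an assumption made for illustration: the loss-gap score, the threshold, and the two toy linear "model versions" are hypothetical and are not taken from the paper. The sketch scores an input by how much worse the currently deployed version fits the label that the leaked version predicts; an input fitted closely to the leaked version's decision surface should show a larger gap than an ordinary input.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Loss of a single prediction with respect to one label.
    return -math.log(softmax(logits)[label])

def linear_logits(weights, x):
    # Toy stand-in for a DNN: one weight row per class.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def loss_gap(leaked_w, current_w, x):
    # Hypothetical score: take the label the leaked version predicts,
    # then compare how well each version fits that label.
    leaked = linear_logits(leaked_w, x)
    label = max(range(len(leaked)), key=leaked.__getitem__)
    current = linear_logits(current_w, x)
    return cross_entropy(current, label) - cross_entropy(leaked, label)

def is_flagged(leaked_w, current_w, x, threshold=0.5):
    # The threshold is an assumed value, chosen only for this toy example.
    return loss_gap(leaked_w, current_w, x) > threshold

# Two toy versions whose decision surfaces are slightly offset.
LEAKED = [[2.0, 0.0], [0.0, 2.0]]
CURRENT = [[2.0, 1.0], [0.0, 2.0]]

benign = [3.0, 0.0]    # both versions agree on this input
crafted = [1.0, 1.2]   # sits just across the leaked version's boundary

print(is_flagged(LEAKED, CURRENT, benign))
print(is_flagged(LEAKED, CURRENT, crafted))
```

In this toy setting the benign input produces no gap because both versions score it identically, while the crafted input is flagged because only the leaked version assigns it to class 1. A real system would use trained DNN versions and a threshold calibrated on held-out benign data.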

Subjects:	Cryptography and Security (cs.CR)
Cite as:	arXiv:2205.10686 [cs.CR]
	(or arXiv:2205.10686v2 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2205.10686
Journal reference:	2022 ACM Conference on Computer and Communications Security (CCS)
Related DOI:	https://doi.org/10.1145/3548606.3560561

Submission history

From: Shawn Shan
[v1] Sat, 21 May 2022 22:31:35 UTC (22,730 KB)
[v2] Sun, 16 Oct 2022 23:44:42 UTC (22,749 KB)
