What Can We Learn from Unlearnable Datasets?

Sandoval-Segura, Pedro; Singla, Vasu; Geiping, Jonas; Goldblum, Micah; Goldstein, Tom

arXiv:2305.19254 (cs)

[Submitted on 30 May 2023 (v1), last revised 7 Nov 2023 (this version, v3)]

Abstract: In an era of widespread web scraping, unlearnable dataset methods have the potential to protect data privacy by preventing deep neural networks from generalizing. But in addition to a number of practical limitations that make their use unlikely, we make a number of findings that call into question their ability to safeguard data. First, it is widely believed that neural networks trained on unlearnable datasets only learn shortcuts, simpler rules that are not useful for generalization. In contrast, we find that networks actually can learn useful features that can be reweighted for high test performance, suggesting that image protection is not assured. Unlearnable datasets are also believed to induce learning shortcuts through linear separability of added perturbations. We provide a counterexample, demonstrating that linear separability of perturbations is not a necessary condition. To emphasize why linearly separable perturbations should not be relied upon, we propose an orthogonal projection attack which allows learning from unlearnable datasets published in ICML 2021 and ICLR 2023. Our proposed attack is significantly less complex than recently proposed techniques.
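
The abstract names an orthogonal projection attack but does not spell out its steps. As a rough illustration of the underlying operation only, the sketch below assumes that a small set of directions carrying the perturbation signal is already available (for instance, the weight rows of a linear classifier fit on the perturbed images; this recovery step is an assumption and is omitted here). It then removes each sample's component along those directions. The names `orthonormal_basis`, `project_out`, and `directions` are hypothetical and chosen for illustration; this is not the authors' released code.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def orthonormal_basis(directions, tol=1e-10):
    """Gram-Schmidt: orthonormal basis for the span of `directions`."""
    basis = []
    for d in directions:
        r = list(d)
        for q in basis:
            c = dot(r, q)
            r = [ri - c * qi for ri, qi in zip(r, q)]
        norm = math.sqrt(dot(r, r))
        if norm > tol:  # skip directions already (nearly) in the span
            basis.append([ri / norm for ri in r])
    return basis

def project_out(x, basis):
    """Project x onto the orthogonal complement of span(basis)."""
    out = list(x)
    for q in basis:
        c = dot(x, q)  # valid because the basis vectors are orthonormal
        out = [oi - c * qi for oi, qi in zip(out, q)]
    return out

# Toy example: a flattened 4-pixel "image" and two assumed perturbation
# directions (hypothetical values, not taken from any real dataset).
directions = [[1.0, 0.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
basis = orthonormal_basis(directions)
x = [0.5, -2.0, 3.0, 4.0]
cleaned = project_out(x, basis)
```

After the projection, `cleaned` has no component along either assumed direction, while the coordinates outside their span are left unchanged. In a full pipeline, a network would then be trained on the projected samples rather than on the original perturbed ones.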

Comments:	Accepted to NeurIPS 2023. Code available at this https URL
Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Cite as:	arXiv:2305.19254 [cs.LG]
	(or arXiv:2305.19254v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2305.19254

Submission history

From: Pedro Sandoval-Segura
[v1] Tue, 30 May 2023 17:41:35 UTC (11,179 KB)
[v2] Tue, 24 Oct 2023 18:34:38 UTC (11,518 KB)
[v3] Tue, 7 Nov 2023 21:52:05 UTC (11,521 KB)
