Semantic Membership Inference Attack against Large Language Models

Mozaffari, Hamid; Marathe, Virendra J.

Computer Science > Machine Learning

arXiv:2406.10218 (cs)

[Submitted on 14 Jun 2024]

Abstract: Membership Inference Attacks (MIAs) determine whether a specific data point was included in the training set of a target model. In this paper, we introduce the Semantic Membership Inference Attack (SMIA), a novel approach that enhances MIA performance by leveraging the semantic content of inputs and their perturbations. SMIA trains a neural network to analyze the target model's behavior on perturbed inputs, effectively capturing variations in output probability distributions between members and non-members. We conduct comprehensive evaluations on the Pythia and GPT-Neo model families using the Wikipedia dataset. Our results show that SMIA significantly outperforms existing MIAs; for instance, SMIA achieves an AUC-ROC of 67.39% on Pythia-12B, compared to 58.90% by the second-best attack.
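
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how an AUC-ROC can be computed for any membership score. It is not the SMIA method: SMIA trains a neural network on the target model's behavior under perturbation, whereas the `perturbation_gap` score here is a hypothetical hand-written stand-in, and it assumes that a member's loss rises more under perturbation than a non-member's. The function names and the toy loss values are assumptions made for illustration, not results from the paper.

```python
from typing import Sequence


def auc_roc(member_scores: Sequence[float], nonmember_scores: Sequence[float]) -> float:
    """Probability that a random member outscores a random non-member.

    Ties count as half a win (the Mann-Whitney formulation of AUC-ROC).
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


def perturbation_gap(original_loss: float, neighbour_losses: Sequence[float]) -> float:
    """Hypothetical membership score: mean loss on perturbed inputs minus
    the loss on the original input. Assumption: larger gap suggests a member."""
    return sum(neighbour_losses) / len(neighbour_losses) - original_loss


# Toy, made-up losses purely for illustration (not taken from the paper).
members = [perturbation_gap(2.0, [2.5, 2.75]), perturbation_gap(1.5, [2.0, 2.5])]
nonmembers = [perturbation_gap(3.0, [3.0, 3.25]), perturbation_gap(2.5, [3.0, 3.5])]
print(auc_roc(members, nonmembers))
```

An AUC-ROC of 0.5 corresponds to random guessing and 1.0 to perfect separation, which is the scale on which the abstract's 67.39% and 58.90% figures should be read.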

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2406.10218 [cs.LG]
	(or arXiv:2406.10218v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2406.10218

Submission history

From: Hamid Mozaffari
[v1] Fri, 14 Jun 2024 17:53:50 UTC (754 KB)
