Gibberish is All You Need for Membership Inference Detection in Contrastive Language-Audio Pretraining

Cheng, Ruoxi; Ding, Yizhong; Cao, Shuirong; Shao, Shitong; Wang, Zhiqiang

arXiv:2410.18371 (cs)

[Submitted on 24 Oct 2024 (v1), last revised 2 Nov 2024 (this version, v2)]

Abstract: Audio can disclose personally identifiable information (PII), particularly when combined with related text data. It is therefore essential to develop tools that detect privacy leakage in Contrastive Language-Audio Pretraining (CLAP). Existing membership inference attacks (MIAs) need audio as input, which risks exposing the voiceprint, and they require costly shadow models. We first propose PRMID, a membership inference detector based on the probability ranking given by CLAP, which does not require training shadow models but still requires both audio and text of the individual as input. To address these limitations, we then propose USMID, a textual unimodal speaker-level membership inference detector that queries the target model using only text data. We randomly generate textual gibberish that is clearly not in the training dataset. We then extract feature vectors from these texts using the CLAP model and train a set of anomaly detectors on them. During inference, the feature vector of each test text is input into the anomaly detectors to determine whether the speaker is in the training set (anomalous) or not (normal). If real audio of the tested speaker is available, USMID can integrate it to further enhance detection. Extensive experiments on various CLAP model architectures and datasets demonstrate that USMID, while using only text data, outperforms baseline methods.
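The USMID steps described in the abstract (generate gibberish, extract text features, fit an anomaly detector, flag test texts) can be sketched roughly as follows. This is only an illustration of the pipeline shape under stated assumptions, not the authors' implementation: `toy_text_features` is a hypothetical stand-in for the CLAP text encoder, and `CentroidDetector` is a simple distance-to-centroid detector swapped in for the paper's unspecified "set of anomaly detectors". All names and parameters here are illustrative.

```python
import hashlib
import math
import random
import string

DIM = 32  # size of the toy feature vector (arbitrary, illustrative choice)


def random_gibberish(rng, length=12):
    """Return a random lowercase string, very unlikely to appear in any training set."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def toy_text_features(text):
    """Hypothetical stand-in for the CLAP text encoder.

    Hashes character bigrams into a fixed-size, unit-norm vector. A real
    pipeline would query the target CLAP model here instead.
    """
    vec = [0.0] * DIM
    for a, b in zip(text, text[1:]):
        digest = hashlib.md5((a + b).encode("utf-8")).digest()
        vec[digest[0] % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class CentroidDetector:
    """Assumed detector: flags points far from the centroid of the gibberish features."""

    def fit(self, features, quantile=0.95):
        n = len(features)
        self.centroid = [sum(col) / n for col in zip(*features)]
        dists = sorted(self._dist(f) for f in features)
        # Threshold at an empirical quantile of the training distances.
        self.threshold = dists[min(n - 1, int(quantile * n))]
        return self

    def _dist(self, f):
        return math.sqrt(sum((x - c) ** 2 for x, c in zip(f, self.centroid)))

    def is_anomalous(self, f):
        return self._dist(f) > self.threshold


# Step 1: gibberish texts that are treated as non-members by construction.
rng = random.Random(0)
gibberish = [random_gibberish(rng) for _ in range(200)]

# Step 2: feature extraction, then fitting the detector on "normal" data.
features = [toy_text_features(t) for t in gibberish]
detector = CentroidDetector().fit(features)


def predict_member(text):
    """Anomalous with respect to gibberish is read as 'speaker was in the training set'."""
    return detector.is_anomalous(toy_text_features(text))
```

Because the toy encoder carries no information about any training set, the decisions it produces are not meaningful. With a real CLAP text encoder, the intuition in the abstract is that texts tied to training-set speakers would produce features that look anomalous relative to the gibberish baseline.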

Subjects:	Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2410.18371 [cs.SD]
	(or arXiv:2410.18371v2 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2410.18371

Submission history

From: Rosy Cheng
[v1] Thu, 24 Oct 2024 02:26:57 UTC (1,648 KB)
[v2] Sat, 2 Nov 2024 10:00:52 UTC (2,352 KB)
