Audio Deepfake Attribution: An Initial Dataset and Investigation

Yan, Xinrui; Yi, Jiangyan; Tao, Jianhua; Chen, Jie

Computer Science > Sound

arXiv:2208.10489 (cs)

[Submitted on 21 Aug 2022 (v1), last revised 17 Nov 2024 (this version, v4)]

Abstract: The rapid progress of deep speech synthesis models has posed significant threats to society such as malicious manipulation of content. This has led to an increase in studies aimed at detecting so-called deepfake audio. However, existing works focus on the binary detection of real audio and fake audio. In real-world scenarios such as model copyright protection and digital evidence forensics, binary classification alone is insufficient. It is essential to identify the source of deepfake audio. Therefore, audio deepfake attribution has emerged as a new challenge. To this end, we designed the first deepfake audio dataset for the attribution of audio generation tools, called Audio Deepfake Attribution (ADA), and conducted a comprehensive investigation on system fingerprints. To address the challenges of attribution of continuously emerging unknown audio generation tools in the real world, we propose the Class-Representation Multi-Center Learning (CRML) method for open-set audio deepfake attribution (OSADA). CRML enhances the global directional variation of representations, ensuring the learning of discriminative representations with strong intra-class similarity and inter-class discrepancy among known classes. Finally, the strong class discrimination capability learned from known classes is extended to both known and unknown classes. Experimental results demonstrate that the CRML method effectively addresses open-set risks in real-world scenarios. The dataset is publicly available at: this https URL, and this https URL.

Comments:	13 pages, 5 figures. arXiv admin note: text overlap with arXiv:2208.10489v3
Subjects:	Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2208.10489 [cs.SD]
	(or arXiv:2208.10489v4 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2208.10489

Submission history

From: Xinrui Yan
[v1] Sun, 21 Aug 2022 05:15:40 UTC (5,527 KB)
[v2] Wed, 15 Feb 2023 06:45:50 UTC (3,303 KB)
[v3] Fri, 15 Sep 2023 07:19:46 UTC (3,810 KB)
[v4] Sun, 17 Nov 2024 09:12:11 UTC (3,251 KB)
