Example-Based Framework for Perceptually Guided Audio Texture Generation

Kamath, Purnima; Gupta, Chitralekha; Wyse, Lonce; Nanayakkara, Suranga

doi:10.1109/TASLP.2024.3393741

Electrical Engineering and Systems Science > Audio and Speech Processing

arXiv:2308.11859 (eess)

[Submitted on 23 Aug 2023 (v1), last revised 14 Apr 2024 (this version, v2)]

Abstract: Controllable generation using StyleGANs is usually achieved by training the model using labeled data. For audio textures, however, there is currently a lack of large semantically labeled datasets. Therefore, to control generation, we develop a method for semantic control over an unconditionally trained StyleGAN in the absence of such labeled datasets. In this paper, we propose an example-based framework to determine guidance vectors for audio texture generation based on user-defined semantic attributes. Our approach leverages the semantically disentangled latent space of an unconditionally trained StyleGAN. By using a few synthetic examples to indicate the presence or absence of a semantic attribute, we infer the guidance vectors in the latent space of the StyleGAN to control that attribute during generation. Our results show that our framework can find user-defined and perceptually relevant guidance vectors for controllable generation for audio textures. Furthermore, we demonstrate an application of our framework to other tasks, such as selective semantic attribute transfer.

Comments:	Accepted for publication at IEEE Transactions on Audio, Speech and Language Processing
Subjects:	Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Sound (cs.SD)
Cite as:	arXiv:2308.11859 [eess.AS]
	(or arXiv:2308.11859v2 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2308.11859
Related DOI:	https://doi.org/10.1109/TASLP.2024.3393741

Submission history

From: Purnima Kamath
[v1] Wed, 23 Aug 2023 01:29:46 UTC (10,460 KB)
[v2] Sun, 14 Apr 2024 10:14:05 UTC (11,252 KB)
