Articulation GAN: Unsupervised modeling of articulatory learning

Beguš, Gašper; Zhou, Alan; Wu, Peter; Anumanchipalli, Gopala K

doi:10.1109/ICASSP49357.2023.10096800


arXiv:2210.15173 (cs)

[Submitted on 27 Oct 2022 (v1), last revised 12 Mar 2023 (this version, v2)]

Abstract: Generative deep neural networks are widely used for speech synthesis, but most existing models directly generate waveforms or spectral outputs. Humans, however, produce speech by controlling articulators, and speech sounds then arise from the physical properties of sound propagation. We introduce the Articulatory Generator, a new unsupervised generative model of speech production and synthesis, to the Generative Adversarial Network paradigm. The Articulatory Generator more closely mimics human speech production by learning to generate articulatory representations (electromagnetic articulography, or EMA) in a fully unsupervised manner. A separate pre-trained physical model (ema2wav) then transforms the generated EMA representations into speech waveforms, which are sent to the Discriminator for evaluation. Articulatory analysis suggests that the network learns to control articulators in a manner similar to humans during speech production. Acoustic analysis of the outputs suggests that the network learns to generate words that are both present in and absent from the training distribution. We additionally discuss implications of articulatory representations for cognitive models of human language and speech technology in general.
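The data flow described in the abstract (latent input, Articulatory Generator, EMA trajectories, frozen ema2wav model, Discriminator) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: every function name, every dimension, and the toy linear layers below are hypothetical stand-ins chosen for illustration, and no training step is shown. The only point it makes is structural: the Generator never emits audio directly, and the waveform reaches the Discriminator only through a fixed EMA-to-waveform stage.

```python
import math
import random

# All sizes below are assumptions for illustration, not values from the paper.
N_LATENT = 8           # length of the latent vector z
N_CHANNELS = 12        # number of EMA channels per frame
N_FRAMES = 20          # EMA frames per generated utterance
SAMPLES_PER_FRAME = 4  # upsampling factor of the stand-in synthesizer


def make_matrix(rows, cols, rng):
    """Return a random weight matrix as a list of rows."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(cols)] for _ in range(rows)]


def generator(z, weights):
    """Hypothetical Articulatory Generator: latent z -> EMA trajectories.

    Returns N_FRAMES frames, each holding N_CHANNELS values in [-1, 1].
    """
    flat = [math.tanh(sum(w * zi for w, zi in zip(row, z))) for row in weights]
    return [flat[t * N_CHANNELS:(t + 1) * N_CHANNELS] for t in range(N_FRAMES)]


def ema2wav_stub(ema, mix):
    """Frozen stand-in for the pre-trained EMA-to-waveform model.

    Each frame is mixed down to one value and held for SAMPLES_PER_FRAME samples.
    """
    wav = []
    for frame in ema:
        value = math.tanh(sum(m * c for m, c in zip(mix, frame)))
        wav.extend([value] * SAMPLES_PER_FRAME)
    return wav


def discriminator(wav, weights):
    """Hypothetical Discriminator: waveform -> scalar realness score."""
    return sum(w * s for w, s in zip(weights, wav))


def forward_pass(seed=0):
    """Run one latent sample through Generator -> ema2wav -> Discriminator."""
    rng = random.Random(seed)
    g_weights = make_matrix(N_FRAMES * N_CHANNELS, N_LATENT, rng)  # would be trained
    mix = [rng.gauss(0.0, 1.0) for _ in range(N_CHANNELS)]         # would stay frozen
    d_weights = [rng.gauss(0.0, 1.0)
                 for _ in range(N_FRAMES * SAMPLES_PER_FRAME)]     # would be trained
    z = [rng.uniform(-1.0, 1.0) for _ in range(N_LATENT)]
    ema = generator(z, g_weights)
    wav = ema2wav_stub(ema, mix)
    score = discriminator(wav, d_weights)
    return ema, wav, score
```

In an actual adversarial setup, gradients from the Discriminator's score would have to pass back through the frozen EMA-to-waveform stage to update the Generator; the sketch leaves that out and shows only the shape of the pipeline.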

Comments:	ICASSP 2023
Subjects:	Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2210.15173 [cs.SD]
	(or arXiv:2210.15173v2 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2210.15173
Journal reference:	ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing
Related DOI:	https://doi.org/10.1109/ICASSP49357.2023.10096800

Submission history

From: Gasper Begus
[v1] Thu, 27 Oct 2022 05:07:04 UTC (3,387 KB)
[v2] Sun, 12 Mar 2023 20:28:46 UTC (3,388 KB)
