Speech vocoding for laboratory phonology

Cernak, Milos; Benus, Stefan; Lazaridis, Alexandros

doi:10.1016/j.csl.2016.10.001

arXiv:1601.05991 (cs)

[Submitted on 22 Jan 2016 (v1), last revised 15 Sep 2016 (this version, v3)]

Abstract: Using phonological speech vocoding, we propose a platform for exploring relations between phonology and speech processing and, in broader terms, between the abstract and physical structures of a speech signal. Our goal is to take a step towards bridging phonology and speech processing and to contribute to the program of Laboratory Phonology. We show three application examples for laboratory phonology: compositional phonological speech modelling, a comparison of phonological systems, and an experimental phonological parametric text-to-speech (TTS) system. The featural representations of three phonological systems are considered in this work: (i) Government Phonology (GP), (ii) the Sound Pattern of English (SPE), and (iii) the extended SPE (eSPE). Comparing GP- and eSPE-based vocoded speech, we conclude that the latter achieves slightly better results than the former. However, GP, the most compact phonological speech representation, performs comparably to the systems with a higher number of phonological features. The parametric TTS based on phonological speech representation, trained on an unlabelled audiobook in an unsupervised manner, achieves 85% of the intelligibility of state-of-the-art parametric speech synthesis. We envision that the presented approach paves the way for researchers in both fields to form meaningful hypotheses that are explicitly testable using the concepts developed and exemplified in this paper. On the one hand, laboratory phonologists might test the applied concepts of their theoretical models; on the other hand, the speech processing community may use the concepts developed for theoretical phonological models to improve current state-of-the-art applications.
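
The featural representations mentioned in the abstract can be pictured as binary vectors with one dimension per phonological feature. The following is a minimal sketch under stated assumptions, not the paper's implementation: the feature inventory is a small, simplified SPE-style subset chosen for illustration, and the names FEATURES, PHONEMES, to_vector and feature_distance are hypothetical.

```python
# Illustrative only: a simplified, hypothetical SPE-style feature subset.
# The paper's actual GP, SPE and eSPE inventories are not reproduced here.
FEATURES = ("vocalic", "consonantal", "nasal", "voice", "continuant")

# Each phoneme is described by the set of features that are active (+).
# The values are assumptions made for this sketch.
PHONEMES = {
    "p": {"consonantal"},
    "b": {"consonantal", "voice"},
    "m": {"consonantal", "nasal", "voice"},
    "s": {"consonantal", "continuant"},
    "a": {"vocalic", "voice", "continuant"},
}


def to_vector(phoneme):
    """Return the binary feature vector of a phoneme, ordered as FEATURES."""
    active = PHONEMES[phoneme]
    return [1 if feature in active else 0 for feature in FEATURES]


def feature_distance(a, b):
    """Count the features on which two phonemes differ (Hamming distance)."""
    return sum(x != y for x, y in zip(to_vector(a), to_vector(b)))


if __name__ == "__main__":
    for phoneme in PHONEMES:
        print(phoneme, to_vector(phoneme))
    print("p vs b:", feature_distance("p", "b"))
```

A more compact system in this picture simply uses fewer dimensions per phoneme, which is the sense in which the abstract compares GP against the larger SPE and eSPE feature sets.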

Subjects:	Computation and Language (cs.CL); Sound (cs.SD)
Report number:	Idiap-RR-07-2016
Cite as:	arXiv:1601.05991 [cs.CL]
	(or arXiv:1601.05991v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1601.05991
Journal reference:	Computer Speech & Language, Volume 42, March 2017, Pages 100-121
Related DOI:	https://doi.org/10.1016/j.csl.2016.10.001

Submission history

From: Milos Cernak
[v1] Fri, 22 Jan 2016 13:22:10 UTC (2,015 KB)
[v2] Mon, 18 Apr 2016 12:06:21 UTC (2,016 KB)
[v3] Thu, 15 Sep 2016 08:26:38 UTC (2,087 KB)
