Multilingual context-based pronunciation learning for Text-to-Speech

Comini, Giulia; Ribeiro, Manuel Sam; Yang, Fan; Shim, Heereen; Lorenzo-Trueba, Jaime

arXiv:2307.16709 (cs)

[Submitted on 31 Jul 2023]

Abstract: Phonetic information and linguistic knowledge are essential components of a text-to-speech (TTS) front-end. Given a language, a lexicon can be collected offline, and Grapheme-to-Phoneme (G2P) relationships are usually modeled in order to predict the pronunciation of out-of-vocabulary (OOV) words. Additionally, post-lexical phonology, often defined in the form of rule-based systems, is used to correct pronunciation within or between words. In this work we showcase a multilingual unified front-end system that addresses any pronunciation-related task typically handled by separate modules. We evaluate the proposed model on G2P conversion and other language-specific challenges, such as homograph and polyphone disambiguation, post-lexical rules and implicit diacritization. We find that the multilingual model is competitive across languages and tasks; however, some trade-offs exist when compared to equivalent monolingual solutions.

Comments:	5 pages, 2 figures, 5 tables. Interspeech 2023
Subjects:	Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2307.16709 [cs.CL]
	(or arXiv:2307.16709v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2307.16709

Submission history

From: Manuel Sam Ribeiro
[v1] Mon, 31 Jul 2023 14:29:06 UTC (125 KB)
