Embeddings for DNN speaker adaptive training

Rownicka, Joanna; Bell, Peter; Renals, Steve

arXiv:1909.13537 (cs)

[Submitted on 30 Sep 2019]

Abstract: In this work, we investigate the use of embeddings for speaker-adaptive training of DNNs (DNN-SAT), focusing on a small amount of adaptation data per speaker. DNN-SAT can be viewed as learning a mapping from each embedding to transformation parameters that are applied to the shared parameters of the DNN. We investigate different approaches to applying these transformations, and find that with a good training strategy, a multi-layer adaptation network applied to all hidden layers is no more effective than a single linear layer acting on the embeddings to transform the input features. In the second part of our work, we evaluate different embeddings (i-vectors, x-vectors and deep CNN embeddings) in an additional speaker recognition task in order to gain insight into what should characterize an embedding for DNN-SAT. We find that the speaker recognition performance of a given representation is not correlated with its ASR performance; in fact, the ability to capture more speech attributes than just speaker identity was the most important characteristic of the embeddings for efficient DNN-SAT ASR. Our best models achieved relative WER gains of 4% over a DNN baseline using speaker-level cepstral mean normalisation (CMN) and 9% over a fully speaker-independent model.
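
The simpler of the two adaptation schemes mentioned in the abstract, a single linear layer acting on the embedding to transform the input features, can be illustrated with a minimal sketch. The abstract does not give the exact form of the transformation, so the additive shift used here, the function names (`linear`, `adapt_features`) and all dimensions and values are assumptions made for illustration only, not the authors' implementation.

```python
def linear(weights, bias, vec):
    """Compute weights @ vec + bias, with the matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def adapt_features(features, embedding, weights, bias):
    """Shift one acoustic feature frame by a speaker-dependent offset.

    Assumption: the offset is produced by one linear layer applied to the
    speaker embedding and is added to the features before they enter the DNN.
    """
    shift = linear(weights, bias, embedding)
    return [x + s for x, s in zip(features, shift)]


# Hypothetical toy sizes: 3-dimensional features, 2-dimensional embedding.
W = [[0.5, 0.0],
     [0.0, -1.0],
     [1.0, 1.0]]
b = [0.0, 0.5, 0.0]

frame = [1.0, 2.0, 3.0]   # one frame of input features
embedding = [2.0, 1.0]    # e.g. an i-vector or x-vector for this speaker

print(adapt_features(frame, embedding, W, b))  # [2.0, 1.5, 6.0]
```

In this sketch `W` and `b` would be learned jointly with the shared DNN parameters, and the same speaker embedding would be reused for every frame from that speaker. Under the additive assumption, this is equivalent to concatenating the embedding to the input of a network whose first layer is linear.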

Comments:	Accepted at ASRU 2019
Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:1909.13537 [cs.CL]
	(or arXiv:1909.13537v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1909.13537

Submission history

From: Joanna Rownicka
[v1] Mon, 30 Sep 2019 09:04:16 UTC (148 KB)
