Comparison of Lattice-Free and Lattice-Based Sequence Discriminative Training Criteria for LVCSR

Michel, Wilfried; Schlüter, Ralf; Ney, Hermann

doi:10.21437/Interspeech.2019-2254

arXiv:1907.01409 (eess)

[Submitted on 1 Jul 2019]

Abstract: Sequence discriminative training criteria have long been a standard tool in automatic speech recognition for improving the performance of acoustic models over their maximum likelihood / cross entropy trained counterparts. While a lattice approximation of the search space was previously necessary to reduce computational complexity, recently proposed methods use other approximations to dispense with the computationally expensive step of separate lattice creation.
In this work we present a memory-efficient implementation of the forward-backward computation that allows us to use unigram word-level language models in the denominator calculation while still performing a full summation on GPU. This allows for a direct comparison of lattice-based and lattice-free sequence discriminative training criteria such as MMI and sMBR, both using the same language model during training.
We compared performance, speed of convergence, and stability on large vocabulary continuous speech recognition tasks such as Switchboard and Quaero. We found that silence modeling seriously impacts performance in the lattice-free case and needs special treatment. In our experiments, lattice-free MMI performs on par with its lattice-based counterpart, while lattice-based sMBR still outperforms all lattice-free training criteria.
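
For readers unfamiliar with the denominator calculation mentioned in the abstract, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a toy setup in which the MMI objective for one utterance is the log of the summed score of the reference paths (numerator) minus the log of the summed score of all competing paths (denominator), each obtained with a log-space forward pass over a small graph. The function names, the graph encoding as (source, destination, label, log-weight) arcs, and the example numbers are all hypothetical; the memory-efficient GPU forward-backward described in the paper is not reproduced here.

```python
import math

NEG_INF = float("-inf")


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over an iterable of log-values."""
    values = list(values)
    m = max(values, default=NEG_INF)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_score(log_emissions, arcs, start, finals):
    """Log of the summed score of all complete paths through a graph.

    log_emissions: one dict {label: log-score} per time frame.
    arcs: list of (src_state, dst_state, label, log_weight); every arc
          consumes exactly one frame (a simplifying assumption).
    """
    alpha = {start: 0.0}  # forward log-scores of the currently active states
    for frame in log_emissions:
        incoming = {}
        for src, dst, label, log_w in arcs:
            if src in alpha and label in frame:
                score = alpha[src] + log_w + frame[label]
                incoming.setdefault(dst, []).append(score)
        alpha = {state: logsumexp(scores) for state, scores in incoming.items()}
    return logsumexp(alpha[s] for s in finals if s in alpha)


def mmi_objective(log_emissions, num_graph, den_graph):
    """Toy MMI value: numerator log-score minus denominator log-score."""
    return (forward_log_score(log_emissions, *num_graph)
            - forward_log_score(log_emissions, *den_graph))


# Hypothetical two-frame example with labels "a" and "b".
emissions = [
    {"a": math.log(0.9), "b": math.log(0.1)},
    {"a": math.log(0.2), "b": math.log(0.8)},
]
lm = math.log(0.5)  # unigram-style weight, identical for both labels

# Numerator: only the reference sequence "a b".
numerator = ([(0, 1, "a", lm), (1, 2, "b", lm)], 0, {2})
# Denominator: any label sequence, via a single looping state.
denominator = ([(0, 0, "a", lm), (0, 0, "b", lm)], 0, {0})

objective = mmi_objective(emissions, numerator, denominator)
# Here the value equals log(0.18 / 0.25) = log(0.72).
```

In this sketch the forward pass keeps only the scores of the current frame. Computing gradients would additionally require a backward pass, which is where memory use becomes a concern for full-summation training.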

Comments:	Submitted to Interspeech 2019
Subjects:	Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Machine Learning (stat.ML)
Cite as:	arXiv:1907.01409 [eess.AS]
	(or arXiv:1907.01409v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.1907.01409
Journal reference:	Interspeech 2019, 20th Annual Conference of the International Speech Communication Association, Graz, Austria, 15-19 September 2019, pp. 1601-1605
Related DOI:	https://doi.org/10.21437/Interspeech.2019-2254

Submission history

From: Wilfried Michel
[v1] Mon, 1 Jul 2019 15:16:04 UTC (17 KB)
