Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition

Higuchi, Yosuke; Moritz, Niko; Le Roux, Jonathan; Hori, Takaaki

arXiv:2106.08922 (eess)

[Submitted on 16 Jun 2021]

Authors: Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori

Abstract: Pseudo-labeling (PL) has been shown to be effective in semi-supervised automatic speech recognition (ASR), where a base model is self-trained with pseudo-labels generated from unlabeled data. While PL can be further improved by iteratively updating pseudo-labels as the model evolves, most previous approaches involve inefficient retraining of the model or intricate control of the label update. We present momentum pseudo-labeling (MPL), a simple yet effective strategy for semi-supervised ASR. Inspired by the mean teacher method, MPL consists of a pair of online and offline models that interact and learn from each other. The online model is trained to predict pseudo-labels generated on the fly by the offline model, while the offline model maintains a momentum-based moving average of the online model. MPL is performed in a single training process, and the interaction between the two models helps them reinforce each other to improve ASR performance. We apply MPL to an end-to-end ASR model based on connectionist temporal classification (CTC). The experimental results demonstrate that MPL improves over the base model and scales to different semi-supervised scenarios with varying amounts of data or domain mismatch.
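The two interacting updates described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary-of-lists parameter layout, the momentum value `alpha`, and the use of greedy CTC decoding to turn the offline model's frame-level outputs into pseudo-labels are all assumptions made for illustration. The step that trains the online model on those pseudo-labels (for example with a CTC loss) is omitted.

```python
from typing import Dict, List

# Hypothetical parameter layout: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def momentum_update(offline: Params, online: Params, alpha: float) -> Params:
    """Moving average of the online weights.

    Each offline weight becomes alpha * offline + (1 - alpha) * online,
    so a larger alpha makes the offline model change more slowly.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return {
        name: [
            alpha * w_off + (1.0 - alpha) * w_on
            for w_off, w_on in zip(offline[name], online[name])
        ]
        for name in offline
    }


def ctc_greedy_labels(frame_scores: List[List[float]], blank: int = 0) -> List[int]:
    """Greedy CTC decoding (assumed here as the pseudo-label generator).

    Takes the highest-scoring token per frame, merges consecutive
    repeats, and drops the blank token.
    """
    labels: List[int] = []
    previous = blank
    for scores in frame_scores:
        best = max(range(len(scores)), key=scores.__getitem__)
        if best != blank and best != previous:
            labels.append(best)
        previous = best
    return labels


if __name__ == "__main__":
    # Offline model output for five frames over the tokens {blank, 1, 2}.
    scores = [
        [0.1, 0.9, 0.0],
        [0.2, 0.8, 0.0],
        [0.9, 0.1, 0.0],
        [0.1, 0.2, 0.7],
        [0.1, 0.2, 0.7],
    ]
    print(ctc_greedy_labels(scores))  # [1, 2]

    # After the online model takes a gradient step, refresh the offline model.
    offline = {"w": [1.0, 2.0]}
    online = {"w": [3.0, 6.0]}
    print(momentum_update(offline, online, alpha=0.5))  # {'w': [2.0, 4.0]}
```

In a training loop built on this sketch, each batch of unlabeled audio would first pass through the offline model to obtain pseudo-labels, the online model would take a gradient step toward them, and `momentum_update` would then be called once per step. The value `alpha=0.5` is chosen only to keep the arithmetic easy to check; in practice a value much closer to 1 would be typical for a slowly moving average.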

Comments:	Accepted to Interspeech 2021
Subjects:	Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
Cite as:	arXiv:2106.08922 [eess.AS]
	(or arXiv:2106.08922v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2106.08922

Submission history

From: Niko Moritz
[v1] Wed, 16 Jun 2021 16:24:55 UTC (96 KB)
