Learning to Recognize Code-switched Speech Without Forgetting Monolingual Speech Recognition

Shah, Sanket; Abraham, Basil; M, Gurunath Reddy; Sitaram, Sunayana; Joshi, Vikas

arXiv:2006.00782 (eess)

[Submitted on 1 Jun 2020]

Abstract: Recently, there has been significant progress in Automatic Speech Recognition (ASR) of code-switched speech, leading to gains in accuracy on code-switched datasets in many language pairs. Code-switched speech co-occurs with monolingual speech in one or both of the languages being mixed. In this work, we show that fine-tuning ASR models on code-switched speech harms performance on monolingual speech. We point out the need to optimize models for code-switching while also ensuring that monolingual performance is not sacrificed. Monolingual models may be trained on thousands of hours of speech, which may not be available for re-training a new model. We propose using the Learning Without Forgetting (LWF) framework for code-switched ASR when we only have access to a monolingual model and do not have the data it was trained on. We show that it is possible to train models using this framework that perform well on both code-switched and monolingual test sets. In cases where we have access to monolingual training data as well, we propose regularization strategies for fine-tuning models for code-switching without sacrificing monolingual accuracy. We report improvements in Word Error Rate (WER) on monolingual and code-switched test sets compared to baselines that use pooled data and simple fine-tuning.
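To make the Learning Without Forgetting idea concrete, here is a minimal sketch of the kind of objective it typically uses: a supervised loss on the new (code-switched) data plus a distillation term that keeps the new model's outputs close to those of the frozen monolingual model on the same inputs. The abstract does not give the paper's exact formulation, so everything below is an assumption for illustration: the per-frame classification setup, the function names, the weight `lam`, and the `temperature` value are all hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional distillation temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy H(target, predicted) between two distributions."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))

def lwf_loss(new_logits, old_logits, label, lam=1.0, temperature=2.0):
    """Illustrative LWF-style loss for one frame (assumed setup, not the paper's).

    new_logits: outputs of the model being fine-tuned on code-switched data
    old_logits: outputs of the frozen monolingual model on the same input
    label:      index of the correct class for the code-switched example
    lam:        hypothetical weight on the distillation term
    """
    # Supervised term: fit the code-switched label.
    one_hot = [1.0 if i == label else 0.0 for i in range(len(new_logits))]
    task_loss = cross_entropy(one_hot, softmax(new_logits))

    # Distillation term: stay close to the old model's softened outputs,
    # which needs no access to the original monolingual training data.
    soft_targets = softmax(old_logits, temperature)
    soft_preds = softmax(new_logits, temperature)
    distill_loss = cross_entropy(soft_targets, soft_preds)

    return task_loss + lam * distill_loss
```

With `lam=0.0` this reduces to plain fine-tuning on code-switched data; larger values of `lam` penalize drift away from the monolingual model's predictions more strongly.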

Comments:	5 pages (4 pages + 1 page references), 5 tables, 1 figure, 1 algorithm, 16 references
Subjects:	Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
Cite as:	arXiv:2006.00782 [eess.AS]
	(or arXiv:2006.00782v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2006.00782

Submission history

From: Sanket Shah
[v1] Mon, 1 Jun 2020 08:16:24 UTC (217 KB)
