Improving CTC-based ASR Models with Gated Interlayer Collaboration

Yang, Yuting; Li, Yuke; Du, Binbin

arXiv:2205.12462 (cs)

[Submitted on 25 May 2022 (v1), last revised 14 Mar 2023 (this version, v2)]

Abstract: CTC-based automatic speech recognition (ASR) models without an external language model usually lack the capacity to model conditional dependencies and textual interactions. In this paper, we present a Gated Interlayer Collaboration (GIC) mechanism that improves the performance of CTC-based models by introducing textual information into the model, thereby relaxing the conditional independence assumption of CTC. Specifically, we take the weighted sum of token embeddings as the textual representation for each position, where the position-specific weights are the softmax probability distributions produced under inter-layer auxiliary CTC losses. The textual representations are then fused with the acoustic features through a gate unit. Experiments on the AISHELL-1, TEDLIUM2, and AIDATATANG corpora show that the proposed method outperforms several strong baselines.
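The mechanism described in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the abstract says only that a "gate unit" fuses the two representations, so the specific gate used here (a sigmoid over the concatenated acoustic and textual vectors, followed by a convex combination) is an assumption, as are all names (`gated_interlayer_fusion`, `ctc_proj`, `gate_w`, `gate_b`) and the toy dimensions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_interlayer_fusion(hidden, ctc_proj, embeddings, gate_w, gate_b):
    """Fuse acoustic features with a soft textual representation.

    hidden:     (T, d)  acoustic features from an intermediate encoder layer
    ctc_proj:   (d, V)  projection to vocabulary logits (intermediate CTC head)
    embeddings: (V, d)  token embedding table
    gate_w:     (2d, d) gate weights (ASSUMED gate form, not from the source)
    gate_b:     (d,)    gate bias
    """
    # Position-specific softmax distribution over the vocabulary.
    probs = softmax(hidden @ ctc_proj, axis=-1)        # (T, V)
    # Textual representation: probability-weighted sum of token embeddings.
    textual = probs @ embeddings                       # (T, d)
    # Assumed gate: sigmoid over the concatenation [hidden; textual].
    gate = sigmoid(np.concatenate([hidden, textual], axis=-1) @ gate_w + gate_b)
    # Convex combination of acoustic and textual features.
    fused = gate * hidden + (1.0 - gate) * textual     # (T, d)
    return fused, probs

# Toy sizes, chosen arbitrarily for illustration.
rng = np.random.default_rng(0)
T, d, V = 5, 8, 12
hidden = rng.normal(size=(T, d))
ctc_proj = rng.normal(size=(d, V))
embeddings = rng.normal(size=(V, d))
gate_w = rng.normal(size=(2 * d, d))
gate_b = np.zeros(d)

fused, probs = gated_interlayer_fusion(hidden, ctc_proj, embeddings, gate_w, gate_b)
```

In training, `probs` would also feed the auxiliary CTC loss at that layer, and `fused` would be passed on to the next encoder layer; both steps are omitted here to keep the sketch short.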

Comments:	Accepted by ICASSP 2023
Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2205.12462 [cs.CL]
	(or arXiv:2205.12462v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2205.12462

Submission history

From: Yuting Yang
[v1] Wed, 25 May 2022 03:21:27 UTC (599 KB)
[v2] Tue, 14 Mar 2023 08:11:26 UTC (371 KB)
