Locality Matters: A Locality-Biased Linear Attention for Automatic Speech Recognition

Sun, Jingyu; Zhong, Guiping; Zhou, Dinghao; Li, Baoxiang; Zhong, Yiran

arXiv:2203.15609 (cs)

[Submitted on 29 Mar 2022]

Abstract: Conformer has shown great success in automatic speech recognition (ASR) on many public benchmarks. One of its crucial drawbacks is its quadratic time and space complexity with respect to the input sequence length, which prevents the model from scaling up and from processing longer input audio sequences. To solve this issue, numerous linear attention methods have been proposed. However, these methods often have limited performance on ASR because they treat all tokens equally in modeling, neglecting the fact that neighbouring tokens are often more strongly connected than distant ones. In this paper, we take this fact into account and propose a new locality-biased linear attention for Conformer. It not only achieves higher accuracy than the vanilla Conformer, but also enjoys linear time and space complexity. Specifically, we replace the softmax attention in Conformer blocks with a locality-biased linear attention (LBLA) mechanism. The LBLA contains a kernel function to ensure linear complexity and a cosine reweighting matrix to place more weight on neighbouring tokens. Extensive experiments on the LibriSpeech corpus show that by introducing this locality bias to the Conformer, our method achieves a lower word error rate with an inference speedup of more than 22%.
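The abstract names the two ingredients of LBLA (a kernel function and a cosine reweighting matrix) but does not give their formulas. The sketch below is one plausible reading, not the authors' implementation: it assumes a ReLU feature map as the kernel and a weight of cos(π/2 · (i − j) / n) between positions i and j, which is largest for neighbouring tokens. The function names `lbla_linear` and `lbla_quadratic` and the `eps` parameter are hypothetical. The identity cos(a − b) = cos a cos b + sin a sin b is what lets the keys and values be aggregated before they meet the queries, so the n × n attention matrix is never formed.

```python
import numpy as np


def lbla_linear(q, k, v, eps=1e-6):
    """Cosine-reweighted linear attention in O(n) time and memory (sketch).

    q, k: arrays of shape (n, d); v: array of shape (n, d_v).
    The ReLU kernel and the cosine weight are assumptions, not taken
    from the paper.
    """
    n = q.shape[0]
    q = np.maximum(q, 0.0)  # assumed kernel: ReLU keeps scores non-negative
    k = np.maximum(k, 0.0)
    theta = (np.pi / 2) * np.arange(n) / n  # per-position angle in [0, pi/2)
    cos = np.cos(theta)[:, None]
    sin = np.sin(theta)[:, None]
    q_cos, q_sin = q * cos, q * sin
    k_cos, k_sin = k * cos, k * sin
    # Aggregate keys with values first: (d, d_v) summaries, independent of n^2.
    num = q_cos @ (k_cos.T @ v) + q_sin @ (k_sin.T @ v)
    den = q_cos @ k_cos.sum(axis=0) + q_sin @ k_sin.sum(axis=0)
    return num / (den[:, None] + eps)


def lbla_quadratic(q, k, v, eps=1e-6):
    """Reference version that builds the full n x n reweighted score matrix."""
    n = q.shape[0]
    q = np.maximum(q, 0.0)
    k = np.maximum(k, 0.0)
    theta = (np.pi / 2) * np.arange(n) / n
    weight = np.cos(theta[:, None] - theta[None, :])  # larger for nearby tokens
    scores = (q @ k.T) * weight
    return (scores @ v) / (scores.sum(axis=1, keepdims=True) + eps)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.standard_normal((16, 8))
    k = rng.standard_normal((16, 8))
    v = rng.standard_normal((16, 8))
    # The two versions should agree up to floating-point rounding.
    print(np.allclose(lbla_linear(q, k, v), lbla_quadratic(q, k, v)))
```

The quadratic reference exists only to check the linear version on short inputs; for long audio sequences only `lbla_linear` would be practical, since its cost grows with n · d · d_v rather than n².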

Comments:	5 pages, 2 figures, submitted to Interspeech 2022
Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2203.15609 [cs.SD]
	(or arXiv:2203.15609v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2203.15609

Submission history

From: Jingyu Sun
[v1] Tue, 29 Mar 2022 14:20:00 UTC (1,359 KB)
