Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models

Fernandez-Lopez, Adriana; Liu, Shiwei; Yin, Lu; Petridis, Stavros; Pantic, Maja

arXiv:2410.07771 (cs)

[Submitted on 10 Oct 2024]

Abstract: This paper investigates low-rank weight training from scratch for large-scale Conformer-based speech recognition models, an under-explored area. Our study demonstrates the viability of this training paradigm for such models and yields several notable findings. First, we discover that applying a low-rank structure exclusively to the attention modules can unexpectedly enhance performance, even with a significant rank reduction of 12%. In contrast, feed-forward layers are more challenging: they begin to show performance degradation at a moderate 50% rank reduction. Furthermore, we find that both initialization and layer-wise rank assignment play critical roles in successful low-rank training. Specifically, employing SVD initialization and linear layer-wise rank mapping significantly boosts the efficacy of low-rank weight training. Building on these insights, we introduce the Low-Rank Speech Model from Scratch (LR-SMS), an approach that achieves performance parity with full-rank training while reducing the parameter count by at least 2x and speeding up training by 1.3x for ASR and 1.15x for AVSR.
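The parameter savings and the layer-wise rank assignment mentioned in the abstract can be illustrated with a little arithmetic. The sketch below is not the authors' implementation. It assumes that a dense weight of shape d_out x d_in is replaced by two factors of shapes d_out x r and r x d_in, and that "linear layer-wise rank mapping" means interpolating a rank ratio linearly across layer depth. The function names, the 512-dimensional layer, and the schedule endpoints (0.5 down to 0.25) are all hypothetical; the abstract does not give the direction or endpoints of the mapping.

```python
def full_rank_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape d_out x d_in (bias ignored)."""
    return d_in * d_out


def low_rank_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in two factors of shapes d_out x rank and rank x d_in (bias ignored)."""
    return rank * (d_in + d_out)


def break_even_rank(d_in: int, d_out: int) -> int:
    """Largest rank at which the factorized layer has strictly fewer parameters."""
    # Solve rank * (d_in + d_out) < d_in * d_out for integer rank.
    return (d_in * d_out - 1) // (d_in + d_out)


def linear_rank_schedule(num_layers: int, full_rank: int,
                         start_ratio: float, end_ratio: float) -> list[int]:
    """Assumed mapping: interpolate the rank ratio linearly from first to last layer."""
    ranks = []
    for i in range(num_layers):
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        ratio = start_ratio + t * (end_ratio - start_ratio)
        ranks.append(max(1, round(ratio * full_rank)))
    return ranks


if __name__ == "__main__":
    d = 512  # hypothetical model dimension, not taken from the paper
    dense = full_rank_params(d, d)
    factored = low_rank_params(d, d, rank=128)
    print(f"dense: {dense}, rank-128 factors: {factored}, ratio: {dense / factored:.1f}x")
    print("break-even rank:", break_even_rank(d, d))
    print("per-layer ranks:", linear_rank_schedule(5, d, 0.5, 0.25))
```

For a square layer, the factorized form only saves parameters when the rank is below half the dimension, so a 2x reduction in that layer corresponds to a rank of one quarter of the dimension under these assumptions.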

Comments:	Submitted to ICASSP 2025
Subjects:	Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2410.07771 [cs.SD]
	(or arXiv:2410.07771v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2410.07771

Submission history

From: Adriana Fernandez Lopez
[v1] Thu, 10 Oct 2024 09:58:35 UTC (415 KB)
