Broadcasted Residual Learning for Efficient Keyword Spotting

Kim, Byeonggeun; Chang, Simyung; Lee, Jinkyu; Sung, Dooyong

arXiv:2106.04140 (cs)

[Submitted on 8 Jun 2021 (v1), last revised 5 Jul 2023 (this version, v4)]

Abstract: Keyword spotting is an important research field because it plays a key role in device wake-up and user interaction on smart devices. However, it is challenging to minimize errors while operating efficiently in devices with limited resources such as mobile phones. We present a broadcasted residual learning method to achieve high accuracy with small model size and computational load. Our method configures most of the residual functions as 1D temporal convolution while still allowing 2D convolution together, using a broadcasted-residual connection that expands the temporal output to the frequency-temporal dimension. This residual mapping enables the network to effectively represent useful audio features with much less computation than conventional convolutional neural networks. We also propose a novel network architecture, Broadcasting-residual network (BC-ResNet), based on broadcasted residual learning and describe how to scale up the model according to the target device's resources. BC-ResNets achieve state-of-the-art 98.0% and 98.7% top-1 accuracy on Google speech command datasets v1 and v2, respectively, and consistently outperform previous approaches, using fewer computations and parameters. Code is available at this https URL.
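The broadcasted-residual connection described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single channel, fixed hand-picked kernels instead of learned weights, frequency averaging as the step that reduces the 2D feature to a temporal one, and the simple combination y = x + BC(f1(avg(f2(x)))). All function names (`conv1d_same`, `freq_conv`, `broadcasted_residual`) are hypothetical and chosen only for illustration.

```python
def conv1d_same(seq, kernel):
    """Zero-padded 1D cross-correlation; output has the same length as seq."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(seq))]


def freq_conv(x, kernel):
    """Stand-in for the 2D part (f2): filter along frequency at each time step.

    x is a list of n_freq rows, each holding n_time values.
    """
    n_freq, n_time = len(x), len(x[0])
    columns = [conv1d_same([x[f][t] for f in range(n_freq)], kernel)
               for t in range(n_time)]
    return [[columns[t][f] for t in range(n_time)] for f in range(n_freq)]


def broadcasted_residual(x, freq_kernel, time_kernel):
    """Assumed form: y = x + BC(f1(avg_over_frequency(f2(x))))."""
    n_freq, n_time = len(x), len(x[0])
    h = freq_conv(x, freq_kernel)                      # 2D feature, shape (F, T)
    pooled = [sum(h[f][t] for f in range(n_freq)) / n_freq
              for t in range(n_time)]                  # temporal feature, shape (T,)
    temporal = conv1d_same(pooled, time_kernel)        # 1D temporal conv (f1)
    # Broadcast: the same temporal value is added to every frequency bin.
    return [[x[f][t] + temporal[t] for t in range(n_time)]
            for f in range(n_freq)]


if __name__ == "__main__":
    x = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0]]                              # 2 frequency bins, 3 frames
    y = broadcasted_residual(x, [0.0, 1.0, 0.0], [1.0, 1.0, 1.0])
    for row in y:
        print(row)
```

The point of the sketch is the cost structure: the temporal convolution runs on a length-T vector rather than on the full F×T map, and its output is expanded back to F×T only by broadcasting in the residual addition.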

Comments:	Proceedings of INTERSPEECH 2021
Subjects:	Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2106.04140 [cs.SD]
	(or arXiv:2106.04140v4 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2106.04140

Submission history

From: Byeonggeun Kim
[v1] Tue, 8 Jun 2021 06:55:39 UTC (1,416 KB)
[v2] Wed, 30 Jun 2021 02:56:06 UTC (1,417 KB)
[v3] Thu, 27 Oct 2022 12:50:28 UTC (1,416 KB)
[v4] Wed, 5 Jul 2023 15:18:54 UTC (1,416 KB)
