Low-Complexity Acoustic Scene Classification Using Data Augmentation and Lightweight ResNet

Li, Yanxiong; Cao, Wenchang; Xie, Wei; Huang, Qisheng; Pang, Wenfeng; He, Qianhua

arXiv:2306.02054 (eess)

[Submitted on 3 Jun 2023]

Abstract: We address low-complexity acoustic scene classification (ASC) with multiple devices, namely Subtask A of Task 1 of the DCASE2021 challenge. This subtask focuses on classifying audio samples recorded by multiple devices with a low-complexity model, which poses two main difficulties. First, the audio samples are recorded by different devices, so there is a device mismatch among them. We reduce the negative impact of this mismatch with several strategies: data augmentation (e.g., mix-up, spectrum correction, pitch shift), a multi-patch network structure, and channel attention. Second, the model size must be smaller than a threshold (e.g., the 128 KB required by the DCASE2021 challenge). To meet this condition, we adopt a ResNet with both depthwise separable convolution and channel attention as the backbone network, and perform model compression. In summary, we propose a low-complexity ASC method using data augmentation and a lightweight ResNet. Evaluated on the official development and evaluation datasets, our method obtains classification accuracies of 71.6% and 66.7%, respectively, and log-loss scores of 1.038 and 1.136, respectively. Our final model size is 110.3 KB, which is below the 128 KB maximum.
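To illustrate why depthwise separable convolution helps with a size limit such as 128 KB, the following minimal sketch compares weight counts for a standard convolution and a depthwise separable one. The layer shape (64 to 64 channels, 3 x 3 kernel), the bit widths, and the convention 1 KB = 1024 bytes are assumptions chosen for illustration; they are not taken from the paper, and the function names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    return k * k * c_in + c_in * c_out


def size_kb(n_params, bits_per_param):
    """Storage in KB, assuming 1 KB = 1024 bytes (an assumption)."""
    return n_params * bits_per_param / 8 / 1024


# Hypothetical layer, not from the paper: 64 -> 64 channels, 3 x 3 kernel.
standard = conv_params(64, 64, 3)
separable = depthwise_separable_params(64, 64, 3)

print(standard, separable)         # 36864 4672
print(size_kb(standard, 32))       # 144.0 KB at 32 bits per weight
print(size_kb(separable, 16))      # 9.125 KB at 16 bits per weight
```

Under these assumptions, a single standard 64-channel layer stored at 32 bits per weight would already exceed 128 KB, while the separable version stored at 16 bits stays far below it. The 16-bit figure only stands in for "model compression" in general; the abstract does not say which compression scheme the authors use.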

Comments:	5 pages, 5 figures, 4 tables. Accepted for publication in the 16th IEEE International Conference on Signal Processing (IEEE ICSP)
Subjects:	Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2306.02054 [eess.AS]
	(or arXiv:2306.02054v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2306.02054

Submission history

From: Yanxiong Li
[v1] Sat, 3 Jun 2023 09:05:29 UTC (877 KB)
