The NTNU Taiwanese ASR System for Formosa Speech Recognition Challenge 2020

Chao, Fu-An; Lo, Tien-Hong; Weng, Shi-Yan; Chiu, Shih-Hsuan; Sung, Yao-Ting; Chen, Berlin

arXiv:2104.04221 (eess)

[Submitted on 9 Apr 2021 (v1), last revised 10 Jul 2021 (this version, v4)]

Abstract: This paper describes the NTNU ASR system that participated in the Formosa Speech Recognition Challenge 2020 (FSR-2020), which is supported by the Formosa Speech in the Wild (FSW) project. FSR-2020 aims to foster the development of Taiwanese speech recognition. In addition to the tonal and dialectal variations of the Taiwanese language, the final test stage also includes speech artificially contaminated with different types of real-world noise; together, these make FSR-2020 much more challenging than before. To work around the under-resourced issue, our ASR system relies on various deep learning techniques, including transfer learning, semi-supervised learning, front-end speech enhancement, and model ensembling, as well as data cleansing and data augmentation applied to the training data. With the best configuration, our system obtains a 13.1% syllable error rate (SER) on the final-test set, achieving first place among all participating systems on Track 3.
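
For readers unfamiliar with the metric, the following is a minimal sketch of how a syllable error rate is commonly computed: the Levenshtein (edit) distance between reference and hypothesis syllable sequences, summed over utterances and divided by the total number of reference syllables. It is not the official FSR-2020 scoring script; the function names, the pooling over utterances, and the example syllable tokens are all illustrative assumptions.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two syllable sequences.

    Counts the minimum number of substitutions, deletions, and
    insertions needed to turn `ref` into `hyp`.
    """
    # prev[j] holds the distance between the ref prefix seen so far
    # and the first j syllables of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def syllable_error_rate(refs: Sequence[Sequence[str]],
                        hyps: Sequence[Sequence[str]]) -> float:
    """Pooled SER: total edit errors / total reference syllables.

    Assumption: errors are pooled over all utterances before dividing,
    which may differ from the challenge's actual scoring procedure.
    """
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of utterances")
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference contains no syllables")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_ref


# Hypothetical example: one utterance with four reference syllables,
# where the recognizer dropped one of them.
refs = [["tai5", "gi2", "li2", "ho2"]]
hyps = [["tai5", "gi2", "ho2"]]
ser = syllable_error_rate(refs, hyps)
```

Under this definition, a reported SER of 13.1% corresponds to roughly 13 syllable-level errors per 100 reference syllables.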

Comments:	17 pages, 3 figures, Accepted for publication in IJCLCLP
Subjects:	Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
Cite as:	arXiv:2104.04221 [eess.AS]
	(or arXiv:2104.04221v4 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2104.04221

Submission history

From: Fu-An Chao
[v1] Fri, 9 Apr 2021 07:26:12 UTC (411 KB)
[v2] Wed, 14 Apr 2021 07:49:43 UTC (411 KB)
[v3] Tue, 20 Apr 2021 03:51:24 UTC (411 KB)
[v4] Sat, 10 Jul 2021 03:59:53 UTC (752 KB)
