Hard Sample Mining for the Improved Retraining of Automatic Speech Recognition

Xue, Jiabin; Han, Jiqing; Zheng, Tieran; Guo, Jiaxing; Wu, Boyong

arXiv:1904.08031 (cs)

[Submitted on 17 Apr 2019]

Abstract: Retraining with ever more new training data from the target domain is an effective way to improve the performance of existing Automatic Speech Recognition (ASR) systems. Recently, the Deep Neural Network (DNN) has become a successful model in the ASR field. When training DNN-based methods, the error between the transcription and the corresponding annotated text is back-propagated to update and optimize the parameters. Thus, the parameters are influenced more by training samples with a large propagated error than by samples with a small one. In this paper, we define samples with significant error as hard samples and try to improve the performance of the ASR system by adding many of them. Unfortunately, hard samples are sparse in the training data of the target domain, and manually labeling them is expensive. Therefore, we propose a hard sample mining method based on enhanced deep multiple instance learning, which can find hard samples in unlabeled training data by using a small, manually labeled subset of the target-domain dataset. We applied our method to an End2End ASR task and obtained the best performance.
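The abstract's motivation, that samples with a larger error influence the parameters more, can be illustrated with a small sketch. This is not the paper's mining method (which relies on enhanced deep multiple instance learning); it only assumes a toy linear model with a squared-error loss, and the function names, data, and threshold are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def per_sample_gradients(
    weights: Sequence[float],
    inputs: Sequence[Sequence[float]],
    targets: Sequence[float],
) -> List[Tuple[float, List[float]]]:
    """Return (error, gradient) for each sample under a squared-error loss.

    Assumed toy setup: prediction = w . x and loss = 0.5 * (prediction - y) ** 2,
    so the gradient with respect to the weights is (prediction - y) * x.
    """
    results = []
    for x, y in zip(inputs, targets):
        prediction = sum(w * xi for w, xi in zip(weights, x))
        error = prediction - y
        gradient = [error * xi for xi in x]  # scales linearly with the error
        results.append((error, gradient))
    return results


def select_hard_samples(errors: Sequence[float], threshold: float) -> List[int]:
    """Return indices of samples whose absolute error exceeds `threshold`.

    The threshold is a hypothetical stand-in for "significant error".
    """
    return [i for i, e in enumerate(errors) if abs(e) > threshold]


# Hypothetical data: samples 0 and 1 share the same input but differ in target.
weights = [0.5, -0.25]
inputs = [[1.0, 2.0], [1.0, 2.0], [2.0, 0.0]]
targets = [0.0, 4.0, 0.5]

stats = per_sample_gradients(weights, inputs, targets)
errors = [e for e, _ in stats]
hard = select_hard_samples(errors, threshold=1.0)

for i, (e, g) in enumerate(stats):
    print(f"sample {i}: error={e}, gradient={g}")
print("hard sample indices:", hard)
```

In this toy setup the gradient is the error multiplied by the input, so the sample with the largest error also produces the largest parameter update. Note that computing the error requires labels, which is why the abstract turns to mining hard samples from unlabeled data.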

Comments:	Submitted to Interspeech 2019
Subjects:	Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:1904.08031 [cs.SD]
	(or arXiv:1904.08031v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.1904.08031

Submission history

From: Jiabin Xue
[v1] Wed, 17 Apr 2019 00:39:35 UTC (2,337 KB)
