A Comparison of Adaptation Techniques and Recurrent Neural Network Architectures

Vanek, Jan; Michalek, Josef; Zelinka, Jan; Psutka, Josef

Electrical Engineering and Systems Science > Audio and Speech Processing

arXiv:1807.06441 (eess)

[Submitted on 12 Jul 2018]

Abstract: Recently, recurrent neural networks (RNNs) have become state-of-the-art in acoustic modeling for automatic speech recognition. Long short-term memory (LSTM) units are the most popular choice. However, alternative units such as the gated recurrent unit (GRU) and its modifications have outperformed LSTM in some publications. In this paper, we compare five neural network (NN) architectures with various adaptation and feature normalization techniques. We evaluate feature-space maximum likelihood linear regression, five variants of i-vector adaptation, and two variants of cepstral mean normalization. Most adaptation and normalization techniques were developed for feed-forward NNs and, according to the results in this paper, not all of them also work with RNNs. For the experiments, we chose the well-known and widely available TIMIT phone recognition task. Phone recognition is much more sensitive to the quality of the acoustic model (AM) than a large-vocabulary task with a complex language model. We have also published open-source scripts to make it easy to replicate the results and to help continue the development.
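As a rough illustration of the cepstral mean normalization mentioned in the abstract, the sketch below subtracts the per-coefficient mean computed over all frames of a single utterance. The abstract does not say which two variants the paper evaluates, so the per-utterance form is an assumption here, and the function name `cepstral_mean_normalize` and the toy feature values are hypothetical, not taken from the authors' published scripts.

```python
from typing import List


def cepstral_mean_normalize(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-coefficient mean over all frames of one utterance.

    `frames` is a list of feature vectors (one per frame), all of the same
    dimension. Per-utterance normalization is an assumption of this sketch.
    """
    if not frames:
        return []
    num_frames = len(frames)
    dim = len(frames[0])
    # Mean of each cepstral coefficient across the utterance.
    means = [sum(frame[d] for frame in frames) / num_frames for d in range(dim)]
    # Remove the mean so every coefficient is zero-centered per utterance.
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]


# Toy utterance: 3 frames, 2 coefficients each (made-up values).
utterance = [[1.0, 10.0], [3.0, 14.0], [5.0, 12.0]]
normalized = cepstral_mean_normalize(utterance)
```

A per-speaker variant would differ only in the set of frames the mean is computed over: all utterances of one speaker rather than a single utterance.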

Comments:	Submitted and accepted to the SLSP 2018 conference. arXiv admin note: text overlap with arXiv:1806.07186, arXiv:1806.07974
Subjects:	Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
Cite as:	arXiv:1807.06441 [eess.AS]
	(or arXiv:1807.06441v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.1807.06441

Submission history

From: Jan Vanek
[v1] Thu, 12 Jul 2018 09:40:21 UTC (25 KB)
