Z-Forcing: Training Stochastic Recurrent Networks

Goyal, Anirudh; Sordoni, Alessandro; Côté, Marc-Alexandre; Ke, Nan Rosemary; Bengio, Yoshua

arXiv:1711.05411 (stat)

[Submitted on 15 Nov 2017 (v1), last revised 16 Nov 2017 (this version, v2)]

Abstract: Many efforts have been devoted to training generative latent variable models with autoregressive decoders, such as recurrent neural networks (RNNs). Stochastic recurrent models have been successful in capturing the variability observed in natural sequential data such as speech. We unify successful ideas from recently proposed architectures into a stochastic recurrent model: each step in the sequence is associated with a latent variable that is used to condition the recurrent dynamics for future steps. Training is performed with amortized variational inference, where the approximate posterior is augmented with an RNN that runs backward through the sequence. In addition to maximizing the variational lower bound, we ease training of the latent variables by adding an auxiliary cost that forces them to reconstruct the state of the backward recurrent network. This provides the latent variables with a task-independent objective that enhances the performance of the overall model. We found this strategy to perform better than alternative approaches such as KL annealing. Although conceptually simple, our model achieves state-of-the-art results on standard speech benchmarks such as TIMIT and Blizzard, and competitive performance on sequential MNIST. Finally, we apply our model to language modeling on the IMDB dataset, where the auxiliary cost helps in learning interpretable latent variables. Source Code: \url{this https URL}
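The training objective described in the abstract (a variational lower bound plus an auxiliary cost on the backward RNN state) can be illustrated with a minimal per-step sketch. This is not the authors' released code. The diagonal-Gaussian form of every term, the weight `alpha`, and all function and argument names are assumptions made for illustration only.

```python
import math

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total

def gaussian_nll(target, mean, logvar):
    """Negative log-likelihood of target under a diagonal Gaussian."""
    total = 0.0
    for t, m, lv in zip(target, mean, logvar):
        total += 0.5 * (math.log(2.0 * math.pi) + lv + (t - m) ** 2 / math.exp(lv))
    return total

def step_loss(recon_nll, kl, aux_nll, alpha=1.0):
    """Per-step loss to minimize: negative lower bound plus auxiliary cost.

    recon_nll: NLL of the next observation given the forward state and latent.
    kl:        KL between the approximate posterior and the prior over the latent.
    aux_nll:   NLL of the backward RNN state predicted from the latent
               (the auxiliary cost); assumed Gaussian here.
    alpha:     hypothetical weight on the auxiliary cost.
    """
    return recon_nll + kl + alpha * aux_nll

# Toy usage with hand-picked numbers (no learned parameters involved).
kl = gaussian_kl([1.0], [0.0], [0.0], [0.0])
aux = gaussian_nll([0.5, -0.5], [0.0, 0.0], [0.0, 0.0])
loss = step_loss(recon_nll=1.0, kl=kl, aux_nll=aux, alpha=0.5)
```

In a full model these three terms would be summed over time steps, with the posterior parameters computed from the backward RNN state; setting `alpha` to zero recovers the plain variational objective.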

Comments:	To appear in NIPS'17
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1711.05411 [stat.ML]
	(or arXiv:1711.05411v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1711.05411

Submission history

From: Anirudh Goyal
[v1] Wed, 15 Nov 2017 05:16:49 UTC (366 KB)
[v2] Thu, 16 Nov 2017 05:10:54 UTC (367 KB)
