Showing 1–2 of 2 results for author: Thurairatnam, R

  1. arXiv:2203.13751  [pdf, other]

    cs.LG cs.CV

    Efficient-VDVAE: Less is more

    Authors: Louay Hazami, Rayhane Mama, Ragavan Thurairatnam

    Abstract: Hierarchical VAEs have emerged in recent years as a reliable option for maximum likelihood estimation. However, instability issues and demanding computational requirements have hindered research progress in the area. We present simple modifications to the Very Deep VAE to make it converge up to $2.6\times$ faster, save up to $20\times$ in memory load and improve stability during training. Despite…

    Submitted 28 April, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

    Comments: Added more information about C1 model configuration, potential negative impact, and fixed some typos

  2. arXiv:2106.04283  [pdf, other]

    cs.SD cs.AI cs.CV cs.LG eess.AS

    NWT: Towards natural audio-to-video generation with representation learning

    Authors: Rayhane Mama, Marc S. Tyndel, Hashiam Kadhim, Cole Clifford, Ragavan Thurairatnam

    Abstract: In this work we introduce NWT, an expressive speech-to-video model. Unlike approaches that use domain-specific intermediate representations such as pose keypoints, NWT learns its own latent representations, with minimal assumptions about the audio and video content. To this end, we propose a novel discrete variational autoencoder with adversarial loss, dVAE-Adv, which learns a new discrete latent…

    Submitted 8 June, 2021; originally announced June 2021.