
Showing 1–3 of 3 results for author: Beliaev, S

Searching in archive cs.
  1. arXiv:2104.08189  [pdf, other]

    eess.AS cs.AI

    TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction

    Authors: Stanislav Beliaev, Boris Ginsburg

    Abstract: We propose TalkNet, a non-autoregressive convolutional neural model for speech synthesis with explicit pitch and duration prediction. The model consists of three feed-forward convolutional networks. The first network predicts grapheme durations. An input text is expanded by repeating each symbol according to the predicted duration. The second network predicts pitch value for every mel frame. The t…

    Submitted 17 June, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2005.05514

  2. arXiv:2005.07815  [pdf, other]

    eess.AS cs.SD

    ConVoice: Real-Time Zero-Shot Voice Style Transfer with Convolutional Network

    Authors: Yurii Rebryk, Stanislav Beliaev

    Abstract: We propose a neural network for zero-shot voice conversion (VC) without any parallel or transcribed data. Our approach uses pre-trained models for automatic speech recognition (ASR) and speaker embedding, obtained from a speaker verification task. Our model is fully convolutional and non-autoregressive except for a small pre-trained recurrent neural network for speaker encoding. ConVoice can conve…

    Submitted 15 May, 2020; originally announced May 2020.

  3. arXiv:1909.09577  [pdf, other]

    cs.LG cs.CL cs.SD eess.AS

    NeMo: a toolkit for building AI applications using Neural Modules

    Authors: Oleksii Kuchaiev, Jason Li, Huyen Nguyen, Oleksii Hrinchuk, Ryan Leary, Boris Ginsburg, Samuel Kriman, Stanislav Beliaev, Vitaly Lavrukhin, Jack Cook, Patrice Castonguay, Mariya Popova, Jocelyn Huang, Jonathan M. Cohen

    Abstract: NeMo (Neural Modules) is a Python framework-agnostic toolkit for creating AI applications through re-usability, abstraction, and composition. NeMo is built around neural modules, conceptual blocks of neural networks that take typed inputs and produce typed outputs. Such modules typically represent data layers, encoders, decoders, language models, loss functions, or methods of combining activations…

    Submitted 13 September, 2019; originally announced September 2019.

    Comments: 6 pages plus references