Showing 1–3 of 3 results for author: Ishi, C T

Searching in archive eess.
  1. arXiv:2307.00393  [pdf, other]

    eess.AS

    Using joint training speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion

    Authors: Houjian Guo, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro

    Abstract: Voice conversion systems have made significant advancements in terms of naturalness and similarity in common voice conversion tasks. However, their performance in more complex tasks such as cross-lingual voice conversion and expressive voice conversion remains imperfect. In this study, we propose a novel approach that combines a jointly trained speaker encoder and content features extracted from t…

    Submitted 1 July, 2023; originally announced July 2023.

  2. arXiv:2302.08296  [pdf, other]

    cs.SD eess.AS

    QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion

    Authors: Houjian Guo, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro

    Abstract: With the development of automatic speech recognition (ASR) and text-to-speech (TTS) technology, high-quality voice conversion (VC) can be achieved by extracting source content information and target speaker information to reconstruct waveforms. However, current methods still require improvement in terms of inference speed. In this study, we propose a lightweight VITS-based VC model that uses the H…

    Submitted 23 February, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

  3. arXiv:2111.15159  [pdf, other]

    cs.SD cs.LG eess.AS

    CycleTransGAN-EVC: A CycleGAN-based Emotional Voice Conversion Model with Transformer

    Authors: Changzeng Fu, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro

    Abstract: In this study, we explore the transformer's ability to capture intra-relations among frames by augmenting the receptive field of models. Concretely, we propose a CycleGAN-based model with the transformer and investigate its ability in the emotional voice conversion task. In the training procedure, we adopt curriculum learning to gradually increase the frame length so that the model can see from th…

    Submitted 30 November, 2021; originally announced November 2021.