Showing 1–5 of 5 results for author: Ishiguro, H

Searching in archive eess.
  1. Quadrupedal Spine Control Strategies: Exploring Correlations Between System Dynamic Responses and Human Perspectives

    Authors: Nicholas Hafner, Chaoran Liu, Carlos Ishi, Hiroshi Ishiguro

    Abstract: Unlike their biological cousins, the majority of existing quadrupedal robots are constructed with rigid chassis. This results in motion that is either beetle-like or distinctly robotic, lacking the natural fluidity characteristic of mammalian movements. Existing literature on quadrupedal robots with spinal configurations primarily focuses on energy efficiency and does not consider the effects in h…

    Submitted 5 May, 2025; originally announced May 2025.

    Comments: 27 pages, 13 figures

    Journal ref: Advanced Robotics, 39(6), 273-290 (2025)

  2. arXiv:2307.00393  [pdf, other]

    eess.AS

    Using joint training speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion

    Authors: Houjian Guo, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro

    Abstract: Voice conversion systems have made significant advancements in terms of naturalness and similarity in common voice conversion tasks. However, their performance in more complex tasks such as cross-lingual voice conversion and expressive voice conversion remains imperfect. In this study, we propose a novel approach that combines a jointly trained speaker encoder and content features extracted from t…

    Submitted 1 July, 2023; originally announced July 2023.

  3. arXiv:2303.00146  [pdf, other]

    cs.HC cs.RO cs.SD eess.AS

    I Know Your Feelings Before You Do: Predicting Future Affective Reactions in Human-Computer Dialogue

    Authors: Yuanchao Li, Koji Inoue, Leimin Tian, Changzeng Fu, Carlos Ishi, Hiroshi Ishiguro, Tatsuya Kawahara, Catherine Lai

    Abstract: Current Spoken Dialogue Systems (SDSs) often serve as passive listeners that respond only after receiving user speech. To achieve human-like dialogue, we propose a novel future prediction architecture that allows an SDS to anticipate future affective reactions based on its current behaviors before the user speaks. In this work, we investigate two scenarios: speech and laughter. In speech, we propo…

    Submitted 17 December, 2024; v1 submitted 28 February, 2023; originally announced March 2023.

    Comments: Accepted to CHI2023 Late-Breaking Work

  4. arXiv:2302.08296  [pdf, other]

    cs.SD eess.AS

    QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion

    Authors: Houjian Guo, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro

    Abstract: With the development of automatic speech recognition (ASR) and text-to-speech (TTS) technology, high-quality voice conversion (VC) can be achieved by extracting source content information and target speaker information to reconstruct waveforms. However, current methods still require improvement in terms of inference speed. In this study, we propose a lightweight VITS-based VC model that uses the H…

    Submitted 23 February, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

  5. arXiv:2111.15159  [pdf, other]

    cs.SD cs.LG eess.AS

    CycleTransGAN-EVC: A CycleGAN-based Emotional Voice Conversion Model with Transformer

    Authors: Changzeng Fu, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro

    Abstract: In this study, we explore the transformer's ability to capture intra-relations among frames by augmenting the receptive field of models. Concretely, we propose a CycleGAN-based model with the transformer and investigate its ability in the emotional voice conversion task. In the training procedure, we adopt curriculum learning to gradually increase the frame length so that the model can see from th…

    Submitted 30 November, 2021; originally announced November 2021.