Showing 1–2 of 2 results for author: Takano, T

Searching in archive eess.
  1. arXiv:2506.23582  [pdf, ps, other]

    cs.SD eess.AS

    RELATE: Subjective evaluation dataset for automatic evaluation of relevance between text and audio

    Authors: Yusuke Kanamori, Yuki Okamoto, Taisei Takano, Shinnosuke Takamichi, Yuki Saito, Hiroshi Saruwatari

    Abstract: In text-to-audio (TTA) research, the relevance between input text and output audio is an important evaluation aspect. Traditionally, it has been evaluated from both subjective and objective perspectives. However, subjective evaluation is costly in terms of money and time, and objective evaluation is unclear regarding the correlation to subjective evaluation scores. In this study, we construct RELA…

    Submitted 30 June, 2025; originally announced June 2025.

    Comments: Accepted to INTERSPEECH2025

  2. arXiv:2506.23553  [pdf, ps, other]

    eess.AS cs.SD

    Human-CLAP: Human-perception-based contrastive language-audio pretraining

    Authors: Taisei Takano, Yuki Okamoto, Yusuke Kanamori, Yuki Saito, Ryotaro Nagase, Hiroshi Saruwatari

    Abstract: Contrastive language-audio pretraining (CLAP) is widely used for audio generation and recognition tasks. For example, CLAPScore, which utilizes the similarity of CLAP embeddings, has been a major metric for the evaluation of the relevance between audio and text in text-to-audio. However, the relationship between CLAPScore and human subjective evaluation scores is still unclarified. We show that CL…

    Submitted 30 June, 2025; originally announced June 2025.