Showing 1–2 of 2 results for author: Asakawa, S

  1. arXiv:2506.01439  [pdf, ps, other]

    cs.CL eess.AS

    Whale: Large-Scale multilingual ASR model with w2v-BERT and E-Branchformer with large speech data

    Authors: Yosuke Kashiwagi, Hayato Futami, Emiru Tsunoo, Satoshi Asakawa

    Abstract: This paper reports on the development of a large-scale speech recognition model, Whale. Similar to models such as Whisper and OWSM, Whale leverages both a large model size and a diverse, extensive dataset. Whale's architecture integrates w2v-BERT self-supervised model, an encoder-decoder backbone built on E-Branchformer, and a joint CTC-attention decoding strategy. The training corpus comprises va…

    Submitted 2 June, 2025; originally announced June 2025.

  2. arXiv:1905.07149  [pdf, ps, other]

    eess.AS cs.CL cs.SD

    End-to-end Adaptation with Backpropagation through WFST for On-device Speech Recognition System

    Authors: Emiru Tsunoo, Yosuke Kashiwagi, Satoshi Asakawa, Toshiyuki Kumakura

    Abstract: An on-device DNN-HMM speech recognition system efficiently works with a limited vocabulary in the presence of a variety of predictable noise. In such a case, vocabulary and environment adaptation is highly effective. In this paper, we propose a novel method of end-to-end (E2E) adaptation, which adjusts not only an acoustic model (AM) but also a weighted finite-state transducer (WFST). We convert a…

    Submitted 24 June, 2019; v1 submitted 17 May, 2019; originally announced May 2019.

    Comments: accepted for Interspeech 2019