Showing 1–8 of 8 results for author: Hentschel, M

Searching in archive cs.
  1. arXiv:2506.18508  [pdf, ps, other]

    stat.ML cs.LG

    Theoretical guarantees for neural estimators in parametric statistics

    Authors: Almut Rödder, Manuel Hentschel, Sebastian Engelke

    Abstract: Neural estimators are simulation-based estimators for the parameters of a family of statistical models, which build a direct mapping from the sample to the parameter vector. They benefit from the versatility of available network architectures and efficient training methods developed in the field of deep learning. Neural estimators are amortized in the sense that, once trained, they can be applied…

    Submitted 23 June, 2025; originally announced June 2025.

  2. arXiv:2506.01263  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    WCTC-Biasing: Retraining-free Contextual Biasing ASR with Wildcard CTC-based Keyword Spotting and Inter-layer Biasing

    Authors: Yu Nakagome, Michael Hentschel

    Abstract: Despite recent advances in end-to-end speech recognition methods, the output tends to be biased to the training data's vocabulary, resulting in inaccurate recognition of proper nouns and other unknown terms. To address this issue, we propose a method to improve recognition accuracy of such rare words in CTC-based models without additional training or text-to-speech systems. Specifically, keyword s…

    Submitted 1 June, 2025; originally announced June 2025.

    Comments: Accepted to Interspeech 2025

  3. arXiv:2406.14890  [pdf, other]

    cs.CL eess.AS

    InterBiasing: Boost Unseen Word Recognition through Biasing Intermediate Predictions

    Authors: Yu Nakagome, Michael Hentschel

    Abstract: Despite recent advances in end-to-end speech recognition methods, their output is biased to the training data's vocabulary, resulting in inaccurate recognition of unknown terms or proper nouns. To improve the recognition accuracy for a given set of such terms, we propose an adaptation parameter-free approach based on Self-conditioned CTC. Our method improves the recognition accuracy of misrecogniz…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024

  4. arXiv:2401.11700  [pdf, other]

    cs.CL cs.SD eess.AS

    Keep Decoding Parallel with Effective Knowledge Distillation from Language Models to End-to-end Speech Recognisers

    Authors: Michael Hentschel, Yuta Nishikawa, Tatsuya Komatsu, Yusuke Fujita

    Abstract: This study presents a novel approach for knowledge distillation (KD) from a BERT teacher model to an automatic speech recognition (ASR) model using intermediate layers. To distil the teacher's knowledge, we use an attention decoder that learns from BERT's token probabilities. Our method shows that language model (LM) information can be more effectively distilled into an ASR model using both the in…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: Accepted at ICASSP 2024

  5. arXiv:2310.06372  [pdf, other]

    cs.CR cs.CV cs.LG

    Leveraging Diffusion-Based Image Variations for Robust Training on Poisoned Data

    Authors: Lukas Struppek, Martin B. Hentschel, Clifton Poth, Dominik Hintersdorf, Kristian Kersting

    Abstract: Backdoor attacks pose a serious security threat for training neural networks as they surreptitiously introduce hidden functionalities into a model. Such backdoors remain silent during inference on clean inputs, evading detection due to inconspicuous behavior. However, once a specific trigger pattern appears in the input data, the backdoor activates, causing the model to execute its concealed funct…

    Submitted 13 December, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Published at NeurIPS 2023 Workshop on Backdoors in Deep Learning: The Good, the Bad, and the Ugly

  6. arXiv:2202.01405  [pdf, other]

    eess.AS cs.CL cs.SD

    Joint Speech Recognition and Audio Captioning

    Authors: Chaitanya Narisetty, Emiru Tsunoo, Xuankai Chang, Yosuke Kashiwagi, Michael Hentschel, Shinji Watanabe

    Abstract: Speech samples recorded in both indoor and outdoor environments are often contaminated with secondary audio sources. Most end-to-end monaural speech recognition systems either remove these background sounds using speech enhancement or train noise-robust models. For better model interpretability and holistic understanding, we aim to bring together the growing field of automated audio captioning (AA…

    Submitted 2 February, 2022; originally announced February 2022.

    Comments: 5 pages, 2 figures. Accepted for ICASSP 2022

  7. arXiv:2201.10190  [pdf, ps, other]

    eess.AS cs.SD

    Run-and-back stitch search: novel block synchronous decoding for streaming encoder-decoder ASR

    Authors: Emiru Tsunoo, Chaitanya Narisetty, Michael Hentschel, Yosuke Kashiwagi, Shinji Watanabe

    Abstract: A streaming style inference of encoder-decoder automatic speech recognition (ASR) system is important for reducing latency, which is essential for interactive use cases. To this end, we propose a novel blockwise synchronous decoding algorithm with a hybrid approach that combines endpoint prediction and endpoint post-determination. In the endpoint prediction, we compute the expectation of the numbe…

    Submitted 25 January, 2022; originally announced January 2022.

    Comments: Accepted for ICASSP2022

  8. arXiv:1407.6714  [pdf, other]

    cs.SI

    CrowdSTAR: A Social Task Routing Framework for Online Communities

    Authors: Besmira Nushi, Omar Alonso, Martin Hentschel, Vasileios Kandylas

    Abstract: The online communities available on the Web have shown to be significantly interactive and capable of collectively solving difficult tasks. Nevertheless, it is still a challenge to decide how a task should be dispatched through the network due to the high diversity of the communities and the dynamically changing expertise and social availability of their members. We introduce CrowdSTAR, a framewor…

    Submitted 24 July, 2014; originally announced July 2014.

    ACM Class: H.4.m; H.5.3