Showing 1–4 of 4 results for author: Beneš, K

  1. arXiv:2410.17437

    eess.AS

    Improving Automatic Speech Recognition with Decoder-Centric Regularisation in Encoder-Decoder Models

    Authors: Alexander Polok, Santosh Kesiraju, Karel Beneš, Lukáš Burget, Jan Černocký

    Abstract: This paper proposes a simple yet effective way of regularising the encoder-decoder-based automatic speech recognition (ASR) models that enhance the robustness of the model and improve the generalisation to out-of-domain scenarios. The proposed approach is dubbed as $\textbf{De}$coder-$\textbf{C}$entric $\textbf{R}$egularisation in $\textbf{E}$ncoder-$\textbf{D}$ecoder (DeCRED) architecture for ASR…

    Submitted 22 October, 2024; originally announced October 2024.

  2. arXiv:2310.11921

    cs.SD eess.AS

    BUT CHiME-7 system description

    Authors: Martin Karafiát, Karel Veselý, Igor Szöke, Ladislav Mošner, Karel Beneš, Marcin Witkowski, Germán Barchi, Leonardo Pepino

    Abstract: This paper describes the joint effort of Brno University of Technology (BUT), AGH University of Krakow and University of Buenos Aires on the development of Automatic Speech Recognition systems for the CHiME-7 Challenge. We train and evaluate various end-to-end models with several toolkits. We heavily relied on Guided Source Separation (GSS) to convert multi-channel audio to single channel. The ASR…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 6 pages, CHiME-7 Challenge 2023

  3. arXiv:2305.12579

    cs.CL cs.SD eess.AS

    Hystoc: Obtaining word confidences for fusion of end-to-end ASR systems

    Authors: Karel Beneš, Martin Kocour, Lukáš Burget

    Abstract: End-to-end (e2e) systems have recently gained wide popularity in automatic speech recognition. However, these systems do generally not provide well-calibrated word-level confidences. In this paper, we propose Hystoc, a simple method for obtaining word-level confidences from hypothesis-level scores. Hystoc is an iterative alignment procedure which turns hypotheses from an n-best output of the ASR s…

    Submitted 21 May, 2023; originally announced May 2023.

  4. arXiv:2003.10778

    eess.IV cs.CV q-bio.QM

    PanNuke Dataset Extension, Insights and Baselines

    Authors: Jevgenij Gamper, Navid Alemi Koohbanani, Ksenija Benes, Simon Graham, Mostafa Jahanifar, Syed Ali Khurram, Ayesha Azam, Katherine Hewitt, Nasir Rajpoot

    Abstract: The emerging area of computational pathology (CPath) is ripe ground for the application of deep learning (DL) methods to healthcare due to the sheer volume of raw pixel data in whole-slide images (WSIs) of cancerous tissue slides. However, it is imperative for the DL algorithms relying on nuclei-level details to be able to cope with data from `the clinical wild', which tends to be quite challengin…

    Submitted 22 April, 2020; v1 submitted 24 March, 2020; originally announced March 2020.

    Comments: Work in progress