Showing 1–2 of 2 results for author: Agić, Ž

  1. arXiv:2102.09928  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Do End-to-End Speech Recognition Models Care About Context?

    Authors: Lasse Borgholt, Jakob Drachmann Havtorn, Željko Agić, Anders Søgaard, Lars Maaløe, Christian Igel

    Abstract: The two most common paradigms for end-to-end speech recognition are connectionist temporal classification (CTC) and attention-based encoder-decoder (AED) models. It has been argued that the latter is better suited for learning an implicit language model. We test this hypothesis by measuring temporal context sensitivity and evaluate how the models perform when we constrain the amount of contextual…

    Submitted 17 February, 2021; originally announced February 2021.

    Comments: Published in the proceedings of INTERSPEECH 2020, pp. 4352-4356

  2. arXiv:2005.00812  [pdf, other]

    cs.CL cs.SD eess.AS

    MultiQT: Multimodal Learning for Real-Time Question Tracking in Speech

    Authors: Jakob D. Havtorn, Jan Latko, Joakim Edin, Lasse Borgholt, Lars Maaløe, Lorenzo Belgrano, Nicolai F. Jacobsen, Regitze Sdun, Željko Agić

    Abstract: We address a challenging and practical task of labeling questions in speech in real time during telephone calls to emergency medical services in English, which embeds within a broader decision support system for emergency call-takers. We propose a novel multimodal approach to real-time sequence labeling in speech. Our model treats speech and its own textual representation as two separate modalitie…

    Submitted 12 May, 2020; v1 submitted 2 May, 2020; originally announced May 2020.

    Comments: Accepted at ACL 2020