Showing 1–3 of 3 results for author: Ohneiser, O

  1. arXiv:2203.16822  [pdf, other]

    eess.AS cs.CL cs.LG

    How Does Pre-trained Wav2Vec 2.0 Perform on Domain Shifted ASR? An Extensive Benchmark on Air Traffic Control Communications

    Authors: Juan Zuluaga-Gomez, Amrutha Prasad, Iuliia Nigmatulina, Saeed Sarfjoo, Petr Motlicek, Matthias Kleinert, Hartmut Helmke, Oliver Ohneiser, Qingran Zhan

    Abstract: Recent work on self-supervised pre-training focuses on leveraging large-scale unlabeled speech data to build robust end-to-end (E2E) acoustic models (AM) that can later be fine-tuned on downstream tasks, e.g., automatic speech recognition (ASR). Yet few works have investigated the impact on performance when the data properties differ substantially between the pre-training and fine-tuning phases, termed d…

    Submitted 17 October, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: To be published in the 2022 IEEE Spoken Language Technology Workshop (SLT 2022)

  2. arXiv:2110.05781  [pdf, other]

    eess.AS cs.CL cs.LG

    BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications

    Authors: Juan Zuluaga-Gomez, Seyyed Saeed Sarfjoo, Amrutha Prasad, Iuliia Nigmatulina, Petr Motlicek, Karel Ondrej, Oliver Ohneiser, Hartmut Helmke

    Abstract: Automatic speech recognition (ASR) allows transcribing the communications between air traffic controllers (ATCOs) and aircraft pilots. The transcriptions are used later to extract ATC named entities, e.g., aircraft callsigns. One common challenge is speech activity detection (SAD) and speaker diarization (SD). In the failure condition, two or more segments remain in the same recording, jeopardizin…

    Submitted 14 October, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: To be published in the 2022 IEEE Spoken Language Technology Workshop (SLT 2022)

  3. arXiv:2108.12175  [pdf, other]

    cs.CL cs.LG eess.AS

    Grammar Based Speaker Role Identification for Air Traffic Control Speech Recognition

    Authors: Amrutha Prasad, Juan Zuluaga-Gomez, Petr Motlicek, Saeed Sarfjoo, Iuliia Nigmatulina, Oliver Ohneiser, Hartmut Helmke

    Abstract: Automatic Speech Recognition (ASR) for air traffic control is generally trained by pooling Air Traffic Controller (ATCO) and pilot data into one set. This is motivated by the fact that pilots' voice communications are scarcer than ATCOs'. Due to this data imbalance and other reasons (e.g., varying acoustic conditions), the speech from ATCOs is usually recognized more accurately than from pilots…

    Submitted 14 December, 2022; v1 submitted 27 August, 2021; originally announced August 2021.

    Comments: Presented at SESAR Innovation Days 2022. See https://www.sesarju.eu/sesarinnovationdays