Showing 1–6 of 6 results for author: Niesler, T

  1. arXiv:2004.04054  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Semi-supervised acoustic and language model training for English-isiZulu code-switched speech recognition

    Authors: A. Biswas, F. de Wet, E. van der Westhuizen, T. R. Niesler

    Abstract: We present an analysis of semi-supervised acoustic and language model training for English-isiZulu code-switched ASR using soap opera speech. Approximately 11 hours of untranscribed multilingual speech was transcribed automatically using four bilingual code-switching transcription systems operating in English-isiZulu, English-isiXhosa, English-Setswana and English-Sesotho. These transcriptions wer…

    Submitted 5 April, 2020; originally announced April 2020.

    Comments: 4th Code-Switch workshop, France

  2. arXiv:1811.08284  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Feature exploration for almost zero-resource ASR-free keyword spotting using a multilingual bottleneck extractor and correspondence autoencoders

    Authors: Raghav Menon, Herman Kamper, Ewald van der Westhuizen, John Quinn, Thomas Niesler

    Abstract: We compare features for dynamic time warping (DTW) when used to bootstrap keyword spotting (KWS) in an almost zero-resource setting. Such quickly-deployable systems aim to support United Nations (UN) humanitarian relief efforts in parts of Africa with severely under-resourced languages. Our objective is to identify acoustic features that provide acceptable KWS performance in such environments. As…

    Submitted 12 July, 2019; v1 submitted 14 November, 2018; originally announced November 2018.

    Comments: 5 pages, 2 figures, 2 tables, 38 references, Accepted at Interspeech 2019

  3. arXiv:1810.12744  [pdf, other]

    cs.LG stat.ML

    Cluster Size Management in Multi-Stage Agglomerative Hierarchical Clustering of Acoustic Speech Segments

    Authors: Lerato Lerato, Thomas Niesler

    Abstract: Agglomerative hierarchical clustering (AHC) requires only the similarity between objects to be known. This is attractive when clustering signals of varying length, such as speech, which are not readily represented in fixed-dimensional vector space. However, AHC is characterised by $O(N^2)$ space and time complexity, making it infeasible for partitioning large datasets. This has recently been addre…

    Submitted 30 October, 2018; originally announced October 2018.

    Comments: 12 pages

  4. arXiv:1810.12722  [pdf, ps, other]

    cs.SD cs.LG eess.AS stat.ML

    Feature Trajectory Dynamic Time Warping for Clustering of Speech Segments

    Authors: Lerato Lerato, Thomas Niesler

    Abstract: Dynamic time warping (DTW) can be used to compute the similarity between two sequences of generally differing length. We propose a modification to DTW that performs individual and independent pairwise alignment of feature trajectories. The modified technique, termed feature trajectory dynamic time warping (FTDTW), is applied as a similarity measure in the agglomerative hierarchical clustering of s…

    Submitted 30 October, 2018; originally announced October 2018.

    Comments: 10 pages

  5. arXiv:1807.08669  [pdf, other]

    cs.CL stat.ML

    Automatic Speech Recognition for Humanitarian Applications in Somali

    Authors: Raghav Menon, Astik Biswas, Armin Saeb, John Quinn, Thomas Niesler

    Abstract: We present our first efforts in building an automatic speech recognition system for Somali, an under-resourced language, using 1.57 hrs of annotated speech for acoustic model training. The system is part of an ongoing effort by the United Nations (UN) to implement keyword spotting systems supporting humanitarian relief programmes in parts of Africa where languages are severely under-resourced. We…

    Submitted 23 July, 2018; originally announced July 2018.

    Comments: 5 pages, 3 figures, 5 tables, accepted at SLTU 2018

  6. arXiv:1807.08666  [pdf, other]

    cs.CL stat.ML

    ASR-free CNN-DTW keyword spotting using multilingual bottleneck features for almost zero-resource languages

    Authors: Raghav Menon, Herman Kamper, Emre Yilmaz, John Quinn, Thomas Niesler

    Abstract: We consider multilingual bottleneck features (BNFs) for nearly zero-resource keyword spotting. This forms part of a United Nations effort using keyword spotting to support humanitarian relief programmes in parts of Africa where languages are severely under-resourced. We use 1920 isolated keywords (40 types, 34 minutes) as exemplars for dynamic time warping (DTW) template matching, which is perform…

    Submitted 23 July, 2018; originally announced July 2018.

    Comments: 5 pages, 3 figures, 3 tables, 1 equation, accepted at SLTU 2018