Showing 1–4 of 4 results for author: Sangwan, A

  1. arXiv:2008.06764  [pdf, other]

    eess.AS cs.SD

    FEARLESS STEPS Challenge (FS-2): Supervised Learning with Massive Naturalistic Apollo Data

    Authors: Aditya Joglekar, John H. L. Hansen, Meena Chandra Shekar, Abhijeet Sangwan

    Abstract: The Fearless Steps Initiative by UTDallas-CRSS led to the digitization, recovery, and diarization of 19,000 hours of original analog audio data, as well as the development of algorithms to extract meaningful information from this multi-channel naturalistic data resource. The 2020 FEARLESS STEPS (FS-2) Challenge is the second annual challenge held for the Speech and Language Technology community to…

    Submitted 15 August, 2020; originally announced August 2020.

    Comments: Paper Accepted in the Interspeech 2020 Conference

  2. arXiv:1907.05584  [pdf, other]

    cs.SD cs.LG eess.AS

    Toeplitz Inverse Covariance based Robust Speaker Clustering for Naturalistic Audio Streams

    Authors: Harishchandra Dubey, Abhijeet Sangwan, John Hansen

    Abstract: Speaker diarization determines "who spoke and when" in an audio stream. In this study, we propose a model-based approach for robust speaker clustering using i-vectors. The i-vectors extracted from different segments of the same speaker are correlated. We model this correlation with a Markov Random Field (MRF) network. Leveraging the advancements in MRF modeling, we used Toeplitz Inverse Covariance (TIC)…

    Submitted 12 July, 2019; originally announced July 2019.

    Comments: 6 Pages, 3 Figures, 5 Equations

  3. arXiv:1808.06045  [pdf, other]

    cs.SD eess.AS

    Robust Speaker Clustering using Mixtures of von Mises-Fisher Distributions for Naturalistic Audio Streams

    Authors: Harishchandra Dubey, Abhijeet Sangwan, John H. L. Hansen

    Abstract: Speaker diarization (i.e., determining "who spoke and when") for multi-speaker naturalistic interactions such as Peer-Led Team Learning (PLTL) sessions is a challenging task. In this study, we propose robust speaker clustering based on a mixture of multivariate von Mises-Fisher distributions. Our diarization pipeline has two stages: (i) ground-truth segmentation; (ii) proposed speaker clustering. The…

    Submitted 18 August, 2018; originally announced August 2018.

    Comments: 5 pages, 2 figures

  4. arXiv:1806.09301  [pdf, other]

    cs.SD eess.AS

    Robust Feature Clustering for Unsupervised Speech Activity Detection

    Authors: Harishchandra Dubey, Abhijeet Sangwan, John H. L. Hansen

    Abstract: In certain applications such as zero-resource speech processing or very-low-resource speech-language systems, it might not be feasible to collect speech activity detection (SAD) annotations. However, state-of-the-art supervised SAD techniques based on neural networks or other machine learning methods require annotated training data matched to the target domain. This paper establishes a clusterin…

    Submitted 25 June, 2018; originally announced June 2018.

    Comments: 5 Pages, 4 Tables, 1 Figure