Showing 1–2 of 2 results for author: Menne, T

  1. Analysis of Deep Clustering as Preprocessing for Automatic Speech Recognition of Sparsely Overlapping Speech

    Authors: Tobias Menne, Ilya Sklyar, Ralf Schlüter, Hermann Ney

    Abstract: Significant performance degradation of automatic speech recognition (ASR) systems is observed when the audio signal contains cross-talk. One of the recently proposed approaches to solve the problem of multi-speaker ASR is the deep clustering (DPCL) approach. Combining DPCL with a state-of-the-art hybrid acoustic model, we obtain a word error rate (WER) of 16.5 % on the commonly used wsj0-2mix data…

    Submitted 25 September, 2019; v1 submitted 9 May, 2019; originally announced May 2019.

    Journal ref: Proceedings of INTERSPEECH 2019

  2. arXiv:1806.07407 [pdf, other]

    cs.CL cs.SD eess.AS

    Speaker Adapted Beamforming for Multi-Channel Automatic Speech Recognition

    Authors: Tobias Menne, Ralf Schlüter, Hermann Ney

    Abstract: This paper presents, in the context of multi-channel ASR, a method to adapt a mask based, statistically optimal beamforming approach to a speaker of interest. The beamforming vector of the statistically optimal beamformer is computed by utilizing speech and noise masks, which are estimated by a neural network. The proposed adaptation approach is based on the integration of the beamformer, which in…

    Submitted 19 June, 2018; originally announced June 2018.

    Comments: submitted to IEEE SLT 2018