Showing 1–4 of 4 results for author: Dissen, Y

  1. arXiv:2407.07566  [pdf, other]

    cs.CL cs.SD eess.AS

    HebDB: a Weakly Supervised Dataset for Hebrew Speech Processing

    Authors: Arnon Turetzky, Or Tal, Yael Segal-Feldman, Yehoshua Dissen, Ella Zeldes, Amit Roth, Eyal Cohen, Yosi Shrem, Bronya R. Chernyak, Olga Seleznova, Joseph Keshet, Yossi Adi

    Abstract: We present HebDB, a weakly supervised dataset for spoken language processing in the Hebrew language. HebDB offers roughly 2500 hours of natural and spontaneous speech recordings in the Hebrew language, consisting of a large variety of speakers and topics. We provide raw recordings together with a pre-processed, weakly supervised, and filtered version. The goal of HebDB is to further enhance resear…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Accepted at Interspeech 2024

  2. arXiv:2406.18928  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Enhanced ASR Robustness to Packet Loss with a Front-End Adaptation Network

    Authors: Yehoshua Dissen, Shiry Yonash, Israel Cohen, Joseph Keshet

    Abstract: In the realm of automatic speech recognition (ASR), robustness in noisy environments remains a significant challenge. Recent ASR models, such as Whisper, have shown promise, but their efficacy in noisy conditions can be further enhanced. This study is focused on recovering from packet loss to improve the word error rate (WER) of ASR models. We propose using a front-end adaptation network connected…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: Accepted for publication at INTERSPEECH 2024

  3. arXiv:2204.04166  [pdf, other]

    cs.SD cs.LG eess.AS

    Self-supervised Speaker Diarization

    Authors: Yehoshua Dissen, Felix Kreuk, Joseph Keshet

    Abstract: Over the last few years, deep learning has grown in popularity for speaker verification, identification, and diarization. Inarguably, a significant part of this success is due to the demonstrated effectiveness of their speaker representations. These, however, are heavily dependent on large amounts of annotated data and can be sensitive to new domains. This study proposes an entirely unsupervised d…

    Submitted 8 April, 2022; originally announced April 2022.

    Comments: Submitted to Interspeech 2022

  4. arXiv:1611.01783  [pdf, other]

    cs.CL cs.SD

    Domain Adaptation For Formant Estimation Using Deep Learning

    Authors: Yehoshua Dissen, Joseph Keshet, Jacob Goldberger, Cynthia Clopper

    Abstract: In this paper we present a domain adaptation technique for formant estimation using a deep network. We first train a deep learning network on a small read speech dataset. We then freeze the parameters of the trained network and use several different datasets to train an adaptation layer that makes the obtained network universal in the sense that it works well for a variety of speakers and speech d…

    Submitted 6 November, 2016; originally announced November 2016.