Showing 1–1 of 1 results for author: Helwani, K

  1. arXiv:2008.04470  [pdf, other]

    eess.AS cs.LG cs.NE cs.SD stat.ML

    PoCoNet: Better Speech Enhancement with Frequency-Positional Embeddings, Semi-Supervised Conversational Data, and Biased Loss

    Authors: Umut Isik, Ritwik Giri, Neerad Phansalkar, Jean-Marc Valin, Karim Helwani, Arvindh Krishnaswamy

    Abstract: Neural network applications generally benefit from larger-sized models, but for current speech enhancement models, larger scale networks often suffer from decreased robustness to the variety of real-world use cases beyond what is encountered in training data. We introduce several innovations that lead to better large neural networks for speech enhancement. The novel PoCoNet architecture is a convo…

    Submitted 10 August, 2020; originally announced August 2020.

    Comments: 5 pages, 3 figures, INTERSPEECH 2020