Showing 1–6 of 6 results for author: Naithani, G

  1. arXiv:2310.16550

    cs.SD eess.AS

    Dynamic Processing Neural Network Architecture For Hearing Loss Compensation

    Authors: Szymon Drgas, Lars Bramsløw, Archontis Politis, Gaurav Naithani, Tuomas Virtanen

    Abstract: This paper proposes neural networks for compensating sensorineural hearing loss. The aim of the hearing loss compensation task is to transform a speech signal to increase speech intelligibility after further processing by a person with a hearing impairment, which is modeled by a hearing loss model. We propose an interpretable model called dynamic processing network, which has a structure similar t…

    Submitted 25 October, 2023; originally announced October 2023.

  2. arXiv:2208.05057

    cs.SD cs.MM eess.AS

    Subjective Evaluation of Deep Neural Network Based Speech Enhancement Systems in Real-World Conditions

    Authors: Gaurav Naithani, Kirsi Pietilä, Riitta Niemistö, Erkki Paajanen, Tero Takala, Tuomas Virtanen

    Abstract: Subjective evaluation results for two low-latency deep neural networks (DNN) are compared to a matured version of a traditional Wiener-filter based noise suppressor. The target use-case is real-world single-channel speech enhancement applications, e.g., communications. Real-world recordings consisting of additive stationary and non-stationary noise types are included. The evaluation is divided int…

    Submitted 14 August, 2022; v1 submitted 9 August, 2022; originally announced August 2022.

    Comments: Accepted for publication in IEEE MMSP 2022

  3. arXiv:2106.11794

    eess.AS cs.SD

    Deep neural network Based Low-latency Speech Separation with Asymmetric analysis-Synthesis Window Pair

    Authors: Shanshan Wang, Gaurav Naithani, Archontis Politis, Tuomas Virtanen

    Abstract: Time-frequency masking or spectrum prediction computed via short symmetric windows are commonly used in low-latency deep neural network (DNN) based source separation. In this paper, we propose the usage of an asymmetric analysis-synthesis window pair which allows for training with targets with better frequency resolution, while retaining the low-latency during inference suitable for real-time spee…

    Submitted 22 June, 2021; originally announced June 2021.

    Comments: Accepted to EUSIPCO-2021

  4. arXiv:1911.00527

    eess.AS cs.LG cs.PF cs.SD

    Memory Requirement Reduction of Deep Neural Networks Using Low-bit Quantization of Parameters

    Authors: Niccoló Nicodemo, Gaurav Naithani, Konstantinos Drossos, Tuomas Virtanen, Roberto Saletti

    Abstract: Effective employment of deep neural networks (DNNs) in mobile devices and embedded systems is hampered by requirements for memory and computational power. This paper presents a non-uniform quantization approach which allows for dynamic quantization of DNN parameters for different layers and within the same layer. A virtual bit shift (VBS) scheme is also proposed to improve the accuracy of the prop…

    Submitted 1 November, 2019; originally announced November 2019.

  5. arXiv:1902.07033

    cs.SD eess.AS

    Low-Latency Deep Clustering For Speech Separation

    Authors: Shanshan Wang, Gaurav Naithani, Tuomas Virtanen

    Abstract: This paper proposes a low algorithmic latency adaptation of the deep clustering approach to speaker-independent speech separation. It consists of three parts: a) the usage of long-short-term-memory (LSTM) networks instead of their bidirectional variant used in the original work, b) using a short synthesis window (here 8 ms) required for low-latency operation, and, c) using a buffer in the beginnin…

    Submitted 19 February, 2019; originally announced February 2019.

    Comments: To appear in ICASSP 2019

  6. arXiv:1807.06899

    cs.SD eess.AS

    Deep neural network based speech separation optimizing an objective estimator of intelligibility for low latency applications

    Authors: Gaurav Naithani, Joonas Nikunen, Lars Bramsløw, Tuomas Virtanen

    Abstract: Mean square error (MSE) has been the preferred choice as loss function in the current deep neural network (DNN) based speech separation techniques. In this paper, we propose a new cost function with the aim of optimizing the extended short time objective intelligibility (ESTOI) measure. We focus on applications where low algorithmic latency ($\leq 10$ ms) is important. We use long short-term memor…

    Submitted 18 July, 2018; originally announced July 2018.

    Comments: To appear at International Workshop on Acoustic Signal Enhancement (IWAENC) 2018