Showing 1–41 of 41 results for author: Reddy, K

  1. arXiv:2506.04890  [pdf, ps, other]

    eess.AS

    Multivariate Probabilistic Assessment of Speech Quality

    Authors: Fredrik Cumlin, Xinyu Liang, Victor Ungureanu, Chandan K. A. Reddy, Christian Schüldt, Saikat Chatterjee

    Abstract: The mean opinion score (MOS) is a standard metric for assessing speech quality, but its singular focus fails to identify specific distortions when low scores are observed. The NISQA dataset addresses this limitation by providing ratings across four additional dimensions: noisiness, coloration, discontinuity, and loudness, alongside MOS. In this paper, we extend the explored univariate MOS estimati…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: Accepted at Interspeech 2025

  2. arXiv:2504.21528  [pdf, ps, other]

    eess.AS

    Impairments are Clustered in Latents of Deep Neural Network-based Speech Quality Models

    Authors: Fredrik Cumlin, Xinyu Liang, Victor Ungureanu, Chandan K. A. Reddy, Christian Schüldt, Saikat Chatterjee

    Abstract: In this article, we provide an experimental observation: Deep neural network (DNN) based speech quality assessment (SQA) models have inherent latent representations where many types of impairments are clustered. While DNN-based SQA models are not trained for impairment classification, our experiments show good impairment classification results in an appropriate SQA latent representation. We invest…

    Submitted 30 April, 2025; originally announced April 2025.

  3. arXiv:2409.18239  [pdf, other]

    cs.SD cs.LG eess.AS

    Towards Sub-millisecond Latency Real-Time Speech Enhancement Models on Hearables

    Authors: Artem Dementyev, Chandan K. A. Reddy, Scott Wisdom, Navin Chatlani, John R. Hershey, Richard F. Lyon

    Abstract: Low latency models are critical for real-time speech enhancement applications, such as hearing aids and hearables. However, the sub-millisecond latency space for resource-constrained hearables remains underexplored. We demonstrate speech enhancement using a computationally efficient minimum-phase FIR filter, enabling sample-by-sample processing to achieve mean algorithmic latency of 0.32 ms to 1.2…

    Submitted 7 March, 2025; v1 submitted 26 September, 2024; originally announced September 2024.

  4. EEG-Based Reaction Time Prediction with Fuzzy Common Spatial Patterns and Phase Cohesion using Deep Autoencoder Based Data Fusion

    Authors: Vivek Singh, Tharun Kumar Reddy

    Abstract: The drowsiness state of a driver is a topic of extensive discussion due to its significant role in causing traffic accidents. This research presents a novel approach that combines Fuzzy Common Spatial Patterns (CSP) optimised Phase Cohesive Sequence (PCS) representations and fuzzy CSP-optimized signal amplitude representations. The research aims to examine alterations in Electroencephalogram (EEG) syn…

    Submitted 1 December, 2023; originally announced December 2023.

  5. arXiv:2309.15216  [pdf]

    cs.LG cs.CV eess.IV

    A Comparative Study of Filters and Deep Learning Models to predict Diabetic Retinopathy

    Authors: Roshan Vasu Muddaluru, Sharvaani Ravikumar Thoguluva, Shruti Prabha, Tanuja Konda Reddy, Suja Palaniswamy

    Abstract: The retina is an essential component of the visual system, and maintaining eyesight depends on the timely and accurate detection of disorders. The early-stage detection and severity classification of Diabetic Retinopathy (DR), a significant risk to the public's health, is the primary goal of this work. This study compares the outcomes of various deep learning models, including InceptionNetV3, Dense…

    Submitted 9 January, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

    Comments: 6 pages, 5 figures, I2CT, 2 tables

  6. arXiv:2309.13716  [pdf, other]

    cs.CV eess.IV

    MOSAIC: Multi-Object Segmented Arbitrary Stylization Using CLIP

    Authors: Prajwal Ganugula, Y S S S Santosh Kumar, N K Sagar Reddy, Prabhath Chellingi, Avinash Thakur, Neeraj Kasera, C Shyam Anand

    Abstract: Style transfer driven by text prompts paved a new path for creatively stylizing the images without collecting an actual style image. Despite having promising results, with text-driven stylization, the user has no control over the stylization. If a user wants to create an artistic image, the user requires fine control over the stylization of various entities individually in the content image, which…

    Submitted 24 September, 2023; originally announced September 2023.

    Comments: Camera ready, New Ideas in Vision Transformers workshop, ICCV 2023

  7. arXiv:2305.12636  [pdf, other]

    eess.SY

    Wavefront Engineering: Realizing Efficient Terahertz Band Communications in 6G and Beyond

    Authors: Arjun Singh, Vitaly Petrov, Hichem Guerboukha, Innem V. A. K. Reddy, Edward W. Knightly, Daniel M. Mittleman, Josep M. Jornet

    Abstract: Terahertz (THz) band communications is envisioned as a key technology for future wireless standards. Substantial progress has been made in this field, with advances in hardware design, channel models, and signal processing. High-rate backhaul links operating at sub-THz frequencies have been experimentally demonstrated. However, there are inherent challenges in making the next great leap for adopti…

    Submitted 21 May, 2023; originally announced May 2023.

    Comments: Accepted to IEEE Wireless Communications Magazine, 2023. ©2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material, creating new works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  8. arXiv:2303.13024  [pdf]

    cs.LG cs.AI eess.SP

    Identifying TBI Physiological States by Clustering Multivariate Clinical Time-Series Data

    Authors: Hamid Ghaderi, Brandon Foreman, Amin Nayebi, Sindhu Tipirneni, Chandan K. Reddy, Vignesh Subbian

    Abstract: Determining clinically relevant physiological states from multivariate time series data with missing values is essential for providing appropriate treatment for acute conditions such as Traumatic Brain Injury (TBI), respiratory failure, and heart failure. Utilizing non-temporal clustering or data imputation and aggregation techniques may lead to loss of valuable information and biased analyses. In…

    Submitted 17 July, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: 10 pages, 7 figures, 2 tables

    Journal ref: AMIA Annu Symp Proc. 2024 Jan 11;2023:379-388

  9. arXiv:2212.02013  [pdf, other]

    eess.AS cs.SD

    Evince the artifacts of Spoof Speech by blending Vocal Tract and Voice Source Features

    Authors: Tadipatri Uday Kiran Reddy, Sahukari Chaitanya Varun, Kota Pranav Kumar Sankala Sreekanth, Kodukula Sri Rama Murty

    Abstract: With the rapid advancement in synthetic speech generation technologies, great interest in differentiating spoof speech from the natural speech is emerging in the research community. The identification of these synthetic signals is a difficult task not only for the cutting-edge classification models but also for humans themselves. To prevent potential adverse effects, it becomes crucial to detect s…

    Submitted 4 December, 2022; originally announced December 2022.

  10. arXiv:2209.06358  [pdf, other]

    cs.SD cs.LG eess.AS

    Using Rater and System Metadata to Explain Variance in the VoiceMOS Challenge 2022 Dataset

    Authors: Michael Chinen, Jan Skoglund, Chandan K A Reddy, Alessandro Ragano, Andrew Hines

    Abstract: Non-reference speech quality models are important for a growing number of applications. The VoiceMOS 2022 challenge provided a dataset of synthetic voice conversion and text-to-speech samples with subjective labels. This study looks at the amount of variance that can be explained in subjective ratings of speech quality from metadata and the distribution imbalances of the dataset. Speech quality mo…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: Preprint; accepted for Interspeech 2022

  11. arXiv:2204.02249  [pdf, other]

    eess.AS

    A Comparison of Deep Learning MOS Predictors for Speech Synthesis Quality

    Authors: Alessandro Ragano, Emmanouil Benetos, Michael Chinen, Helard B. Martinez, Chandan K. A. Reddy, Jan Skoglund, Andrew Hines

    Abstract: Speech synthesis quality prediction has made remarkable progress with the development of supervised and self-supervised learning (SSL) MOS predictors but some aspects related to the data are still unclear and require further study. In this paper, we evaluate several MOS predictors based on wav2vec 2.0 and the NISQA speech quality prediction model to explore the role of the training data, the influ…

    Submitted 24 November, 2023; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: Accepted ISSC 2023

  12. arXiv:2110.04331  [pdf, ps, other]

    eess.AS cs.SD

    MusicNet: Compact Convolutional Neural Network for Real-time Background Music Detection

    Authors: Chandan K. A. Reddy, Vishak Gopal, Harishchandra Dubey, Sergiy Matusevych, Ross Cutler, Robert Aichner

    Abstract: With the recent growth of remote work, online meetings often encounter challenging audio contexts such as background noise, music, and echo. Accurate real-time detection of music events can help to improve the user experience. In this paper, we present MusicNet, a compact neural model for detecting background music in the real-time communications pipeline. In video meetings, music frequently co-oc…

    Submitted 15 April, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

  13. arXiv:2110.01763  [pdf, other]

    eess.AS cs.SD

    DNSMOS P.835: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors

    Authors: Chandan K A Reddy, Vishak Gopal, Ross Cutler

    Abstract: Human subjective evaluation is the gold standard to evaluate speech quality optimized for human perception. Perceptual objective metrics serve as a proxy for subjective scores. We have recently developed a non-intrusive speech quality metric called Deep Noise Suppression Mean Opinion Score (DNSMOS) using the scores from ITU-T Rec. P.808 subjective evaluation. The P.808 scores reflect the overall q…

    Submitted 4 February, 2022; v1 submitted 4 October, 2021; originally announced October 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2010.15258

  14. arXiv:2107.12719  [pdf, other]

    cs.MM cs.CV cs.SD eess.AS

    The CORSMAL benchmark for the prediction of the properties of containers

    Authors: Alessio Xompero, Santiago Donaher, Vladimir Iashin, Francesca Palermo, Gökhan Solak, Claudio Coppola, Reina Ishikawa, Yuichi Nagao, Ryo Hachiuma, Qi Liu, Fan Feng, Chuanlin Lan, Rosa H. M. Chan, Guilherme Christmann, Jyun-Ting Song, Gonuguntla Neeharika, Chinnakotla Krishna Teja Reddy, Dinesh Jain, Bakhtawar Ur Rehman, Andrea Cavallaro

    Abstract: The contactless estimation of the weight of a container and the amount of its content manipulated by a person are key pre-requisites for safe human-to-robot handovers. However, opaqueness and transparencies of the container and the content, and variability of materials, shapes, and sizes, make this estimation difficult. In this paper, we present a range of methods and an open framework to benchmar…

    Submitted 21 April, 2022; v1 submitted 27 July, 2021; originally announced July 2021.

    Comments: Authors' post-print accepted for publication in IEEE Access, see https://doi.org/10.1109/ACCESS.2022.3166906 . 14 pages, 6 tables, 7 figures

    Journal ref: IEEE Access, vol. 10, 2022, 1-15

  15. arXiv:2107.02437  [pdf, ps, other]

    cs.IT eess.SP

    Turbo Coded Single User Massive MIMO

    Authors: K. Vasudevan, A. Phani Kumar Reddy, Gyanesh Kumar Pathak, Mahmoud Albreem

    Abstract: This work deals with turbo coded single user massive multiple input multiple output (SU-MMIMO) systems, with and without precoding. SU-MMIMO has a much higher spectral efficiency compared to multi-user massive MIMO (MU-MMIMO) since independent signals are transmitted from each of the antenna elements (spatial multiplexing). MU-MMIMO that uses beamforming has a much lower spectral efficiency, since…

    Submitted 6 July, 2021; originally announced July 2021.

    Comments: 12 pages, 8 figures, journal. arXiv admin note: substantial text overlap with arXiv:2007.15959

  16. arXiv:2101.09249  [pdf, other]

    eess.AS cs.SD

    Towards efficient models for real-time deep noise suppression

    Authors: Sebastian Braun, Hannes Gamper, Chandan K. A. Reddy, Ivan Tashev

    Abstract: With recent research advancements, deep learning models are becoming attractive and powerful choices for speech enhancement in real-time applications. While state-of-the-art models can achieve outstanding results in terms of speech quality and background noise reduction, the main challenge is to obtain compact enough models, which are resource efficient during inference time. An important but ofte…

    Submitted 19 May, 2021; v1 submitted 22 January, 2021; originally announced January 2021.

  17. arXiv:2101.01902  [pdf, other]

    cs.SD cs.LG eess.AS

    Interspeech 2021 Deep Noise Suppression Challenge

    Authors: Chandan K A Reddy, Harishchandra Dubey, Kazuhito Koishida, Arun Nair, Vishak Gopal, Ross Cutler, Sebastian Braun, Hannes Gamper, Robert Aichner, Sriram Srinivasan

    Abstract: The Deep Noise Suppression (DNS) challenge is designed to foster innovation in the area of noise suppression to achieve superior perceptual speech quality. We recently organized a DNS challenge special session at INTERSPEECH and ICASSP 2020. We open-sourced training and test datasets for the wideband scenario. We also open-sourced a subjective evaluation framework based on ITU-T standard P.808, wh…

    Submitted 4 April, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2009.06122

  18. arXiv:2010.15258  [pdf, other]

    cs.SD cs.LG eess.AS

    DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to evaluate Noise Suppressors

    Authors: Chandan K A Reddy, Vishak Gopal, Ross Cutler

    Abstract: Human subjective evaluation is the gold standard to evaluate speech quality optimized for human perception. Perceptual objective metrics serve as a proxy for subjective scores. The conventional and widely used metrics require a reference clean speech signal, which is unavailable in real recordings. The no-reference approaches correlate poorly with human ratings and are not widely adopted in the re…

    Submitted 10 February, 2021; v1 submitted 28 October, 2020; originally announced October 2020.

    Comments: Submitted to ICASSP 2020

  19. Latency Analysis for IMT-2020 Radio Interface Technology Evaluation

    Authors: A. Phani Kumar Reddy, Navin Kumar, Sri Sai Apoorva Tirumalasetty, Srinivasan S, Vinosh Babu James J

    Abstract: The International Telecommunication Union (ITU) is currently deliberating on the finalization of candidate radio interface technologies (RITs) for IMT-2020 (International Mobile Telecommunications) suitability. The candidate technologies are currently being evaluated and after a couple of ITU-Radiocommunication sector (ITU-R) working party (WP) meetings, they will become official. Although, produc…

    Submitted 26 September, 2020; originally announced September 2020.

    Comments: Accepted in 2020 IEEE 3rd 5G World Forum (WF-5G) Conference

  20. arXiv:2009.06122  [pdf, other]

    eess.AS

    ICASSP 2021 Deep Noise Suppression Challenge

    Authors: Chandan K A Reddy, Harishchandra Dubey, Vishak Gopal, Ross Cutler, Sebastian Braun, Hannes Gamper, Robert Aichner, Sriram Srinivasan

    Abstract: The Deep Noise Suppression (DNS) challenge is designed to foster innovation in the area of noise suppression to achieve superior perceptual speech quality. We recently organized a DNS challenge special session at INTERSPEECH 2020. We open sourced training and test datasets for researchers to train their noise suppression models. We also open sourced a subjective evaluation framework and used the t…

    Submitted 26 October, 2020; v1 submitted 13 September, 2020; originally announced September 2020.

  21. arXiv:2009.05838  [pdf, other]

    cs.LG eess.SY

    Guided Policy Search Based Control of a High Dimensional Advanced Manufacturing Process

    Authors: Amit Surana, Kishore Reddy, Matthew Siopis

    Abstract: In this paper we apply a guided policy search (GPS) based reinforcement learning framework to a high dimensional optimal control problem arising in an additive manufacturing process. The problem consists of controlling the process parameters so that layer-wise deposition of material leads to desired geometric characteristics of the resulting part surface while minimizing the material deposited. A…

    Submitted 12 September, 2020; originally announced September 2020.

  22. arXiv:2007.15959  [pdf, ps, other]

    cs.IT eess.SP

    Turbo Coded Single User Massive MIMO with Precoding

    Authors: K. Vasudevan, Gyanesh Kumar Pathak, A. Phani Kumar Reddy

    Abstract: Precoding is a method of compensating the channel at the transmitter. This work presents a novel method of data detection in turbo coded single user massive multiple input multiple output (MIMO) systems using precoding. We show via computer simulations that, when precoding is used, re-transmitting the data does not result in significant reduction in bit-error-rate (BER), thus increasing the spectr…

    Submitted 31 July, 2020; originally announced July 2020.

    Comments: 6 pages, 4 figures, conference

  23. arXiv:2005.13981  [pdf]

    eess.AS cs.LG cs.SD

    The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Testing Framework, and Challenge Results

    Authors: Chandan K. A. Reddy, Vishak Gopal, Ross Cutler, Ebrahim Beyrami, Roger Cheng, Harishchandra Dubey, Sergiy Matusevych, Robert Aichner, Ashkan Aazami, Sebastian Braun, Puneet Rana, Sriram Srinivasan, Johannes Gehrke

    Abstract: The INTERSPEECH 2020 Deep Noise Suppression (DNS) Challenge is intended to promote collaborative research in real-time single-channel Speech Enhancement aimed to maximize the subjective (perceptual) quality of the enhanced speech. A typical approach to evaluate the noise suppression methods is to use objective metrics on the test set obtained by splitting the original dataset. While the performanc…

    Submitted 18 October, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

    Comments: Interspeech 2020. arXiv admin note: substantial text overlap with arXiv:2001.08662

  24. arXiv:2005.06394  [pdf, other]

    cs.LG eess.SP stat.ML

    A CNN-LSTM Quantifier for Single Access Point CSI Indoor Localization

    Authors: Minh Tu Hoang, Brosnan Yuen, Kai Ren, Xiaodai Dong, Tao Lu, Robert Westendorp, Kishore Reddy

    Abstract: This paper proposes a combined network structure between convolutional neural network (CNN) and long-short term memory (LSTM) quantifier for WiFi fingerprinting indoor localization. In contrast to conventional methods that utilize only spatial data with classification models, our CNN-LSTM network extracts both space and time features of the received channel state information (CSI) from a single ro…

    Submitted 13 May, 2020; originally announced May 2020.

    Comments: Channel state information (CSI), WiFi indoor localization, convolutional neural network, long short-term memory, fingerprint-based localization

  25. arXiv:2003.11774  [pdf, other]

    cs.CV cs.LG eess.IV

    Image Generation Via Minimizing Fréchet Distance in Discriminator Feature Space

    Authors: Khoa D. Doan, Saurav Manchanda, Fengjiao Wang, Sathiya Keerthi, Avradeep Bhowmik, Chandan K. Reddy

    Abstract: For a given image generation problem, the intrinsic image manifold is often low dimensional. We use the intuition that it is much better to train the GAN generator by minimizing the distributional distance between real and generated images in a small dimensional feature space representing such a manifold than on the original pixel-space. We use the feature space of the GAN discriminator for such a…

    Submitted 30 March, 2020; v1 submitted 26 March, 2020; originally announced March 2020.

  26. arXiv:2001.10601  [pdf, other]

    eess.AS cs.SD

    Weighted Speech Distortion Losses for Neural-network-based Real-time Speech Enhancement

    Authors: Yangyang Xia, Sebastian Braun, Chandan K. A. Reddy, Harishchandra Dubey, Ross Cutler, Ivan Tashev

    Abstract: This paper investigates several aspects of training an RNN (recurrent neural network) that impact the objective and subjective quality of enhanced speech for real-time single-channel speech enhancement. Specifically, we focus on an RNN that enhances short-time speech spectra on a single-frame-in, single-frame-out basis, a framework adopted by most classical signal processing methods. We propose two…

    Submitted 12 February, 2020; v1 submitted 28 January, 2020; originally announced January 2020.

  27. arXiv:2001.09571  [pdf]

    eess.AS

    Noise dependent Super Gaussian-Coherence based dual microphone Speech Enhancement for hearing aid application using smartphone

    Authors: Nikhil Shankar, Gautam S Bhat, Chandan K A Reddy, Issa Panahi

    Abstract: In this paper, the coherence between speech and noise signals is used to obtain a Speech Enhancement (SE) gain function, in combination with a Super Gaussian Joint Maximum a Posteriori (SGJMAP) single microphone SE gain function. The proposed SE method can be implemented on a smartphone that works as an assistive device to hearing aids. Although coherence SE gain function suppresses the background…

    Submitted 26 January, 2020; originally announced January 2020.

    Comments: 4 pages, 3 figures

  28. arXiv:2001.08662  [pdf]

    cs.SD cs.LG eess.AS

    The INTERSPEECH 2020 Deep Noise Suppression Challenge: Datasets, Subjective Speech Quality and Testing Framework

    Authors: Chandan K. A. Reddy, Ebrahim Beyrami, Harishchandra Dubey, Vishak Gopal, Roger Cheng, Ross Cutler, Sergiy Matusevych, Robert Aichner, Ashkan Aazami, Sebastian Braun, Puneet Rana, Sriram Srinivasan, Johannes Gehrke

    Abstract: The INTERSPEECH 2020 Deep Noise Suppression Challenge is intended to promote collaborative research in real-time single-channel Speech Enhancement aimed to maximize the subjective (perceptual) quality of the enhanced speech. A typical approach to evaluate the noise suppression methods is to use objective metrics on the test set obtained by splitting the original dataset. Many publications report r…

    Submitted 19 April, 2020; v1 submitted 23 January, 2020; originally announced January 2020.

    Comments: Details about Deep Noise Suppression Challenge

  29. Semi-Sequential Probabilistic Model For Indoor Localization Enhancement

    Authors: Minh Tu Hoang, Brosnan Yuen, Xiaodai Dong, Tao Lu, Robert Westendorp, Kishore Reddy

    Abstract: This paper proposes a semi-sequential probabilistic model (SSP) that applies an additional short term memory to enhance the performance of the probabilistic indoor localization. The conventional probabilistic methods normally treat the locations in the database indiscriminately. In contrast, SSP leverages the information of the previous position to determine the probable location since the user's…

    Submitted 8 January, 2020; originally announced January 2020.

    Report number: 1558-1748

    Journal ref: IEEE Sensors Journal Volume 20 Issue 11 (2020) 6160 - 6169

  30. arXiv:1909.08050  [pdf]

    cs.SD cs.LG eess.AS

    A scalable noisy speech dataset and online subjective test framework

    Authors: Chandan K. A. Reddy, Ebrahim Beyrami, Jamie Pool, Ross Cutler, Sriram Srinivasan, Johannes Gehrke

    Abstract: Background noise is a major source of quality impairments in Voice over Internet Protocol (VoIP) and Public Switched Telephone Network (PSTN) calls. Recent work shows the efficacy of deep learning for noise suppression, but the datasets have been relatively small compared to those used in other domains (e.g., ImageNet) and the associated evaluations have been more focused. In order to better facil…

    Submitted 17 September, 2019; originally announced September 2019.

    Comments: InterSpeech 2019

  31. arXiv:1909.03974  [pdf, other]

    eess.AS cs.LG cs.SD

    DNN-based cross-lingual voice conversion using Bottleneck Features

    Authors: M Kiran Reddy, K Sreenivasa Rao

    Abstract: Cross-lingual voice conversion (CLVC) is a quite challenging task since the source and target speakers speak different languages. This paper proposes a CLVC framework based on bottleneck features and deep neural network (DNN). In the proposed method, the bottleneck features extracted from a deep auto-encoder (DAE) are used to represent speaker-independent features of speech signals from different…

    Submitted 10 September, 2019; v1 submitted 9 September, 2019; originally announced September 2019.

  32. arXiv:1908.09634  [pdf, ps, other]

    eess.AS cs.SD eess.SP

    Multilingual and Multimode Phone Recognition System for Indian Languages

    Authors: Kumud Tripathi, M. Kiran Reddy, K. Sreenivasa Rao

    Abstract: The aim of this paper is to develop a flexible framework capable of automatically recognizing phonetic units present in a speech utterance of any language spoken in any mode. In this study, we considered two modes of speech: conversation, and read modes in four Indian languages, namely, Telugu, Kannada, Odia, and Bengali. The proposed approach consists of two stages: (1) Automatic speech mode clas…

    Submitted 23 August, 2019; originally announced August 2019.

    Comments: 33 pages, 5 figures, 6 tables, article

  33. arXiv:1907.01742  [pdf]

    cs.SD cs.LG eess.AS

    Supervised Classifiers for Audio Impairments with Noisy Labels

    Authors: Chandan K A Reddy, Ross Cutler, Johannes Gehrke

    Abstract: Voice-over-Internet-Protocol (VoIP) calls are prone to various speech impairments due to environmental and network conditions resulting in bad user experience. A reliable audio impairment classifier helps to identify the cause for bad audio quality. The user feedback after the call can act as the ground truth labels for training a supervised classifier on a large audio dataset. However, the labels…

    Submitted 3 July, 2019; originally announced July 2019.

    Comments: To appear in INTERSPEECH 2019

  34. arXiv:1903.11703  [pdf, other]

    eess.SP cs.LG stat.ML

    Recurrent Neural Networks For Accurate RSSI Indoor Localization

    Authors: Minh Tu Hoang, Brosnan Yuen, Xiaodai Dong, Tao Lu, Robert Westendorp, Kishore Reddy

    Abstract: This paper proposes recurrent neural networks (RNNs) for fingerprinting indoor localization using WiFi. Instead of locating user's position one at a time as in the cases of conventional algorithms, our RNN solution aims at trajectory positioning and takes into account the relation among the received signal strength indicator (RSSI) measurements in a trajectory. Furthermore, a weighted average fi…

    Submitted 22 October, 2019; v1 submitted 27 March, 2019; originally announced March 2019.

    Comments: Received signal strength indicator (RSSI), WiFi indoor localization, recurrent neural network (RNN), long short-term memory (LSTM), fingerprint-based localization

    Report number: 2327-4662

    Journal ref: IEEE Internet of Things Journal Volume 6, Issue 6 (2019) 10639 - 10651

  35. arXiv:1901.07457  [pdf, ps, other]

    q-bio.QM cs.HC eess.IV eess.SP

    Divergence Framework for EEG based Multiclass Motor Imagery Brain Computer Interface

    Authors: Satyam Kumar, Tharun Kumar Reddy, Laxmidhar Behera

    Abstract: Similar to most of the real world data, the ubiquitous presence of non-stationarities in the EEG signals significantly perturbs the feature distribution, thus deteriorating the performance of Brain Computer Interface. In this letter, a novel method is proposed based on Joint Approximate Diagonalization (JAD) to optimize stationarity for multiclass motor imagery Brain Computer Interface (BCI) in an i…

    Submitted 12 January, 2019; originally announced January 2019.

  36. An individualized super Gaussian single microphone Speech Enhancement for hearing aid users with smartphone as an assistive device

    Authors: Chandan K A Reddy, Nikhil Shankar, Gautam Bhat, Ram Charan, Issa Panahi

    Abstract: In this letter, we derive a new super Gaussian Joint Maximum a Posteriori based single microphone speech enhancement gain function. The developed Speech Enhancement method is implemented on a smartphone, and this arrangement functions as an assistive device to hearing aids. We introduce a tradeoff parameter in the derived gain function that allows the smartphone user to customize their listening p…

    Submitted 10 December, 2018; originally announced December 2018.

    Comments: 5 pages

  37. arXiv:1812.03914  [pdf]

    cs.SD eess.AS

    A Computationally Efficient and Practically Feasible Two Microphones Blind Speech Separation Method

    Authors: Chandan K A Reddy, Gautam Bhat, Nikhil Shankar, Issa Panahi

    Abstract: Traditionally, Blind Speech Separation techniques are computationally expensive as they update the demixing matrix at every time frame index, making them impractical to use in many Real-Time applications. In this paper, a robust data-driven two-microphone sound source localization method is used as a criterion to reduce the computational complexity of the Independent Vector Analysis (IVA) Blind Sp…

    Submitted 10 December, 2018; originally announced December 2018.

    Comments: 5 pages

  38. arXiv:1806.07566  [pdf, other]

    eess.SP

    Database Assisted Automatic Modulation Classification Using Sequential Minimal Optimization

    Authors: K. Pavan Kumar Reddy, K. Lakhan Shiva, K. Abhilash, Y. Yoganandam

    Abstract: In this paper, we have proposed a novel algorithm for identifying the modulation scheme of an unknown incoming signal in order to mitigate the interference with primary user in Cognitive Radio systems, which is facilitated by using Automatic Modulation Classification (AMC) at the front end of Software Defined Radio (SDR). In this study, we used computer simulations of analog and digital modulation…

    Submitted 20 June, 2018; originally announced June 2018.

    Comments: 5 pages, 2 figures, 18th International Symposium on Wireless Personal Multimedia Communications, Hyderabad, India

  39. arXiv:1805.02099  [pdf]

    physics.flu-dyn eess.IV physics.ins-det

    Time-resolved quantitative visualization of complex flow field emanating from an open-ended shock tube by using wavefront measuring camera

    Authors: Biswajit Medhi, Gopalakrishna M. Hegde, Kalidevapura Jagannath Reddy, Debasish Roy, Ram Mohan Vasu

    Abstract: Quantitative visualization of shock-induced complex flow field emanating from the open end of a miniaturized hand-driven shock tube (Reddy tube) is presented. During operation, the planar shock wave of Mach number Mi=1.3 is discharged through the low-pressure driven-section, kept open to ambient atmosphere. From the moment of shock discharge, its aftereffects of evolving flow field are recorded qu…

    Submitted 28 May, 2018; v1 submitted 5 May, 2018; originally announced May 2018.

    Comments: 12 pages, 7 figures

  40. arXiv:1708.07732  [pdf, other]

    eess.SY cs.AI

    Multi-Agent Q-Learning for Minimizing Demand-Supply Power Deficit in Microgrids

    Authors: Raghuram Bharadwaj Diddigi, D. Sai Koti Reddy, Shalabh Bhatnagar

    Abstract: We consider the problem of minimizing the difference in the demand and the supply of power using microgrids. We set up multiple microgrids that provide electricity to a village. They have access to the batteries that can store renewable power and also the electrical lines from the main grid. During each time period, these microgrids need to decide on the amount of renewable power to be used…

    Submitted 28 August, 2017; v1 submitted 25 August, 2017; originally announced August 2017.

  41. arXiv:1702.06250  [pdf, ps, other]

    eess.SY

    Generalized Deterministic Perturbations For Stochastic Gradient Search

    Authors: K. Chandramouli, K. J. Prabuchandran, D. Sai Koti Reddy, Shalabh Bhatnagar

    Abstract: Stochastic optimization (SO) considers the problem of optimizing an objective function in the presence of noise. Most of the solution techniques in SO estimate gradients from the noise corrupted observations of the objective and adjust parameters of the objective along the direction of the estimated gradients to obtain locally optimal solutions. Two prominent algorithms in SO namely Random Directi…

    Submitted 2 August, 2018; v1 submitted 20 February, 2017; originally announced February 2017.

    Comments: Accepted in Control and Decision Conference