Showing 1–13 of 13 results for author: Krishnan, P

Searching in archive eess.
  1. arXiv:2106.10997  [pdf, other]

    eess.AS cs.SD

    Towards sound based testing of COVID-19 -- Summary of the first Diagnostics of COVID-19 using Acoustics (DiCOVA) Challenge

    Authors: Neeraj Kumar Sharma, Ananya Muguli, Prashant Krishnan, Rohit Kumar, Srikanth Raj Chetupalli, Sriram Ganapathy

    Abstract: The technology development for point-of-care tests (POCTs) targeting respiratory diseases has witnessed a growing demand in the recent past. Investigating the presence of acoustic biomarkers in modalities such as cough, breathing and speech sounds, and using them for building POCTs can offer fast, contactless and inexpensive testing. In view of this, over the past year, we launched the "Coswara"…

    Submitted 21 June, 2021; originally announced June 2021.

    Comments: Manuscript in review in the Elsevier Computer Speech and Language journal

  2. arXiv:2106.00639  [pdf, other]

    eess.AS cs.SD eess.SP

    Multi-modal Point-of-Care Diagnostics for COVID-19 Based On Acoustics and Symptoms

    Authors: Srikanth Raj Chetupalli, Prashant Krishnan, Neeraj Sharma, Ananya Muguli, Rohit Kumar, Viral Nanda, Lancelot Mark Pinto, Prasanta Kumar Ghosh, Sriram Ganapathy

    Abstract: The research direction of identifying acoustic bio-markers of respiratory diseases has received renewed interest following the onset of COVID-19 pandemic. In this paper, we design an approach to COVID-19 diagnostic using crowd-sourced multi-modal data. The data resource, consisting of acoustic signals like cough, breathing, and speech signals, along with the data of symptoms, are recorded using a…

    Submitted 5 June, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

    Comments: The Manuscript is submitted to IEEE-EMBS Journal of Biomedical and Health Informatics on June 1, 2021

  3. arXiv:2103.09148  [pdf, other]

    eess.AS cs.SD

    DiCOVA Challenge: Dataset, task, and baseline system for COVID-19 diagnosis using acoustics

    Authors: Ananya Muguli, Lancelot Pinto, Nirmala R., Neeraj Sharma, Prashant Krishnan, Prasanta Kumar Ghosh, Rohit Kumar, Shrirama Bhat, Srikanth Raj Chetupalli, Sriram Ganapathy, Shreyas Ramoji, Viral Nanda

    Abstract: The DiCOVA challenge aims at accelerating research in diagnosing COVID-19 using acoustics (DiCOVA), a topic at the intersection of speech and audio processing, respiratory health diagnosis, and machine learning. This challenge is an open call for researchers to analyze a dataset of sound recordings collected from COVID-19 infected and non-COVID-19 individuals for a two-class classification. These…

    Submitted 17 June, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

    Comments: To appear in Proceedings of Interspeech, 2021

  4. arXiv:2008.04527  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Neural PLDA Modeling for End-to-End Speaker Verification

    Authors: Shreyas Ramoji, Prashant Krishnan, Sriram Ganapathy

    Abstract: While deep learning models have made significant advances in supervised classification problems, the application of these models for out-of-set verification tasks like speaker recognition has been limited to deriving feature embeddings. The state-of-the-art x-vector PLDA based speaker verification systems use a generative model based on probabilistic linear discriminant analysis (PLDA) for computi…

    Submitted 11 August, 2020; originally announced August 2020.

    Comments: Accepted in Interspeech 2020. GitHub Implementation Repos: https://github.com/iiscleap/E2E-NPLDA and https://github.com/iiscleap/NeuralPlda

  5. arXiv:2007.06021  [pdf, other]

    eess.AS cs.LG

    NISP: A Multi-lingual Multi-accent Dataset for Speaker Profiling

    Authors: Shareef Babu Kalluri, Deepu Vijayasenan, Sriram Ganapathy, Ragesh Rajan M, Prashant Krishnan

    Abstract: Many commercial and forensic applications of speech demand the extraction of information about the speaker characteristics, which falls into the broad category of speaker profiling. The speaker characteristics needed for profiling include physical traits of the speaker like height, age, and gender of the speaker along with the native language of the speaker. Many of the datasets available have onl…

    Submitted 12 July, 2020; originally announced July 2020.

    Comments: 5 pages, initial version submitted to Interspeech 2020

  6. Coswara -- A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis

    Authors: Neeraj Sharma, Prashant Krishnan, Rohit Kumar, Shreyas Ramoji, Srikanth Raj Chetupalli, Nirmala R., Prasanta Kumar Ghosh, Sriram Ganapathy

    Abstract: The COVID-19 pandemic presents global challenges transcending boundaries of country, race, religion, and economy. The current gold standard method for COVID-19 detection is the reverse transcription polymerase chain reaction (RT-PCR) testing. However, this method is expensive, time-consuming, and violates social distancing. Also, as the pandemic is expected to stay for a while, there is a need for…

    Submitted 11 August, 2020; v1 submitted 21 May, 2020; originally announced May 2020.

    Comments: A description of Coswara dataset to evaluate COVID-19 diagnosis using respiratory sounds

  7. arXiv:2002.03562  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    NPLDA: A Deep Neural PLDA Model for Speaker Verification

    Authors: Shreyas Ramoji, Prashant Krishnan, Sriram Ganapathy

    Abstract: The state-of-art approach for speaker verification consists of a neural network based embedding extractor along with a backend generative model such as the Probabilistic Linear Discriminant Analysis (PLDA). In this work, we propose a neural network approach for backend modeling in speaker recognition. The likelihood ratio score of the generative PLDA model is posed as a discriminative similarity f…

    Submitted 24 May, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

    Comments: Published in Odyssey 2020, the Speaker and Language Recognition Workshop (VOiCES Special Session). Link to GitHub Implementation: https://github.com/iiscleap/NeuralPlda. arXiv admin note: substantial text overlap with arXiv:2001.07034

    Journal ref: in Proc. Odyssey 2020 The Speaker and Language Recognition Workshop, Pages 202-209

  8. arXiv:2002.02735  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    LEAP System for SRE19 CTS Challenge -- Improvements and Error Analysis

    Authors: Shreyas Ramoji, Prashant Krishnan, Bhargavram Mysore, Prachi Singh, Sriram Ganapathy

    Abstract: The NIST Speaker Recognition Evaluation - Conversational Telephone Speech (CTS) challenge 2019 was an open evaluation for the task of speaker verification in challenging conditions. In this paper, we provide a detailed account of the LEAP SRE system submitted to the CTS challenge focusing on the novel components in the back-end system modeling. All the systems used the time-delay neural network (T…

    Submitted 24 May, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

    Comments: Published In Proc. Odyssey 2020, the Speaker and Language Recognition Workshop. Link to GitHub Implementation: https://github.com/iiscleap/NeuralPlda

    Journal ref: in Proc. Odyssey 2020 The Speaker and Language Recognition Workshop, 281--288

  9. arXiv:2001.07034  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    Pairwise Discriminative Neural PLDA for Speaker Verification

    Authors: Shreyas Ramoji, Prashant Krishnan V, Prachi Singh, Sriram Ganapathy

    Abstract: The state-of-art approach to speaker verification involves the extraction of discriminative embeddings like x-vectors followed by a generative model back-end using a probabilistic linear discriminant analysis (PLDA). In this paper, we propose a Pairwise neural discriminative model for the task of speaker verification which operates on a pair of speaker embeddings such as x-vectors/i-vectors and ou…

    Submitted 7 February, 2020; v1 submitted 20 January, 2020; originally announced January 2020.

    Comments: This paper was submitted to IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2020. Link to GitHub Repository: https://github.com/iiscleap/NeuralPlda

  10. SURE-fuse WFF: A Multi-resolution Windowed Fourier Analysis for Interferometric Phase Denoising

    Authors: Joshin P. Krishnan, Mário A. T. Figueiredo, José M. Bioucas-Dias

    Abstract: Interferometric phase (InPhase) imaging is an important part of many present-day coherent imaging technologies. Often in such imaging techniques, the acquired images, known as interferograms, suffer from two major degradations: 1) phase wrapping caused by the fact that the sensing mechanism can only measure sinusoidal $2π$-periodic functions of the actual phase, and 2) noise introduced by the acqu…

    Submitted 26 February, 2019; v1 submitted 9 November, 2018; originally announced November 2018.

  11. arXiv:1810.10571  [pdf, other]

    eess.SP

    Patch-based Interferometric Phase Estimation via Mixture of Gaussian Density Modelling & Non-local Averaging in the Complex Domain

    Authors: Joshin P. Krishnan, José M. Bioucas-Dias

    Abstract: This paper addresses interferometric phase (InPhase) image denoising, i.e., the denoising of phase modulo-2π images from sinusoidal 2π-periodic and noisy observations. The wrapping discontinuities present in the InPhase images, which are to be preserved carefully, make InPhase denoising a challenging inverse problem. We propose a novel two-step algorithm to tackle this problem by exploiting the no…

    Submitted 24 October, 2018; originally announced October 2018.

    Comments: British Machine Vision Conference, 2017

  12. arXiv:1810.08090  [pdf, other]

    eess.SP

    Dictionary Learning Phase Retrieval from Noisy Diffraction Patterns

    Authors: Joshin P. Krishnan, José M. Bioucas-Dias, Vladimir Katkovnik

    Abstract: This paper proposes a novel algorithm for image phase retrieval, i.e., for recovering complex-valued images from the amplitudes of noisy linear combinations (often the Fourier transform) of the sought complex images. The algorithm is developed using the alternating projection framework and is aimed to obtain high performance for heavily noisy (Poissonian or Gaussian) observations. The estimation o…

    Submitted 18 October, 2018; originally announced October 2018.

  13. arXiv:1301.0043  [pdf, ps, other]

    cs.HC cs.RO eess.SY

    A Framework for Analysing Driver Interactions with Semi-Autonomous Vehicles

    Authors: Siraj Shaikh, Padmanabhan Krishnan

    Abstract: Semi-autonomous vehicles are increasingly serving critical functions in various settings from mining to logistics to defence. A key characteristic of such systems is the presence of the human (drivers) in the control loop. To ensure safety, both the driver needs to be aware of the autonomous aspects of the vehicle and the automated features of the vehicle built to enable safer control. In this pap…

    Submitted 31 December, 2012; originally announced January 2013.

    Comments: In Proceedings FTSCS 2012, arXiv:1212.6574

    ACM Class: H.1.2

    Journal ref: EPTCS 105, 2012, pp. 85-99