Showing 1–23 of 23 results for author: Prasad, S

Searching in archive eess.
  1. arXiv:2506.12154  [pdf, ps, other]

    cs.SD eess.AS

    Adapting Whisper for Streaming Speech Recognition via Two-Pass Decoding

    Authors: Haoran Zhou, Xingchen Song, Brendan Fahy, Qiaochu Song, Binbin Zhang, Zhendong Peng, Anshul Wadhawan, Denglin Jiang, Apurv Verma, Vinay Ramesh, Srivas Prasad, Michele M. Franceschini

    Abstract: OpenAI Whisper is a family of robust Automatic Speech Recognition (ASR) models trained on 680,000 hours of audio. However, its encoder-decoder architecture, trained with a sequence-to-sequence objective, lacks native support for streaming ASR. In this paper, we fine-tune Whisper for streaming ASR using the WeNet toolkit by adopting a Unified Two-pass (U2) structure. We introduce an additional Conn…

    Submitted 13 June, 2025; originally announced June 2025.

    Comments: Accepted to INTERSPEECH 2025

  2. arXiv:2506.04539  [pdf, other]

    cs.RO cs.ET cs.LG eess.SY

    Olfactory Inertial Odometry: Sensor Calibration and Drift Compensation

    Authors: Kordel K. France, Ovidiu Daescu, Anirban Paul, Shalini Prasad

    Abstract: Visual inertial odometry (VIO) is a process for fusing visual and kinematic data to understand a machine's state in a navigation task. Olfactory inertial odometry (OIO) is an analog to VIO that fuses signals from gas sensors with inertial data to help a robot navigate by scent. Gas dynamics and environmental factors introduce disturbances into olfactory navigation tasks that can make OIO difficult…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: Published as a full conference paper at the 2025 IEEE International Symposium on Inertial Sensors & Systems

  3. arXiv:2502.17459  [pdf, other]

    eess.SP cs.LG

    Study on Downlink CSI compression: Are Neural Networks the Only Solution?

    Authors: K. Sai Praneeth, Anil Kumar Yerrapragada, Achyuth Sagireddi, Sai Prasad, Radha Krishna Ganti

    Abstract: Massive Multi Input Multi Output (MIMO) systems enable higher data rates in the downlink (DL) with spatial multiplexing achieved by forming narrow beams. The higher DL data rates are achieved by effective implementation of spatial multiplexing and beamforming which is subject to availability of DL channel state information (CSI) at the base station. For Frequency Division Duplexing (FDD) systems,…

    Submitted 10 February, 2025; originally announced February 2025.

  4. arXiv:2411.00254  [pdf, other]

    eess.IV cs.CV cs.LG

    A Novel Breast Ultrasound Image Augmentation Method Using Advanced Neural Style Transfer: An Efficient and Explainable Approach

    Authors: Lipismita Panigrahi, Prianka Rani Saha, Jurdana Masuma Iqrah, Sushil Prasad

    Abstract: Clinical diagnosis of breast malignancy (BM) is a challenging problem in the recent era. In particular, Deep learning (DL) models have continued to offer important solutions for early BM diagnosis but their performance experiences overfitting due to the limited volume of breast ultrasound (BUS) image data. Further, large BUS datasets are difficult to manage due to privacy and legal concerns. Hence…

    Submitted 31 October, 2024; originally announced November 2024.

  5. arXiv:2410.19436  [pdf, other]

    eess.SP cs.LG

    On the Application of Deep Learning for Precise Indoor Positioning in 6G

    Authors: Sai Prasanth Kotturi, Anil Kumar Yerrapragada, Sai Prasad, Radha Krishna Ganti

    Abstract: Accurate localization in indoor environments is a challenge due to the Non Line of Sight (NLoS) nature of the signaling. In this paper, we explore the use of AI/ML techniques for positioning accuracy enhancement in Indoor Factory (InF) scenarios. The proposed neural network, which we term LocNet, is trained on measurements such as Channel Impulse Response (CIR) and Reference Signal Received Power…

    Submitted 25 October, 2024; originally announced October 2024.

    Comments: 6 Pages, 6 Figures

  6. arXiv:2403.02909  [pdf, other]

    cs.CV cs.HC eess.IV

    Gaze-Vector Estimation in the Dark with Temporally Encoded Event-driven Neural Networks

    Authors: Abeer Banerjee, Naval K. Mehta, Shyam S. Prasad, Himanshu, Sumeet Saurav, Sanjay Singh

    Abstract: In this paper, we address the intricate challenge of gaze vector prediction, a pivotal task with applications ranging from human-computer interaction to driver monitoring systems. Our innovative approach is designed for the demanding setting of extremely low-light conditions, leveraging a novel temporal event encoding scheme, and a dedicated neural network architecture. The temporal encoding metho…

    Submitted 5 March, 2024; originally announced March 2024.

  7. arXiv:2303.12719  [pdf, other]

    cs.CV cs.LG eess.IV

    Toward Polar Sea-Ice Classification using Color-based Segmentation and Auto-labeling of Sentinel-2 Imagery to Train an Efficient Deep Learning Model

    Authors: Jurdana Masuma Iqrah, Younghyun Koo, Wei Wang, Hongjie Xie, Sushil Prasad

    Abstract: Global warming is an urgent issue that is generating catastrophic environmental changes, such as the melting of sea ice and glaciers, particularly in the polar regions. The melting pattern and retreat of polar sea ice cover is an essential indicator of global warming. The Sentinel-2 satellite (S2) captures high-resolution optical imagery over the polar regions. This research aims at developing a r…

    Submitted 8 March, 2023; originally announced March 2023.

    Comments: 2nd Annual AAAI Workshop on AI to Accelerate Science and Engineering (AI2ASE), February 2023

  8. arXiv:2203.16973  [pdf, other]

    cs.CL cs.SD eess.AS

    Analyzing the factors affecting usefulness of Self-Supervised Pre-trained Representations for Speech Recognition

    Authors: Ashish Seth, Lodagala V S V Durga Prasad, Sreyan Ghosh, S. Umesh

    Abstract: Self-supervised learning (SSL) to learn high-level speech representations has been a popular approach to building Automatic Speech Recognition (ASR) systems in low-resource settings. However, the common assumption made in literature is that a considerable amount of unlabeled data is available for the same domain or language that can be leveraged for SSL pre-training, which we acknowledge is not fe…

    Submitted 17 May, 2023; v1 submitted 31 March, 2022; originally announced March 2022.

  9. arXiv:2203.16965  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations

    Authors: Lodagala V S V Durga Prasad, Sreyan Ghosh, S. Umesh

    Abstract: While self-supervised speech representation learning (SSL) models serve a variety of downstream tasks, these models have been observed to overfit to the domain from which the unlabelled data originates. To alleviate this issue, we propose PADA (Pruning Assisted Domain Adaptation) and zero out redundant weights from models pre-trained on large amounts of out-of-domain (OOD) data. Intuitively, this…

    Submitted 13 May, 2023; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: Accepted to IEEE SLT 2022

  10. arXiv:2203.05408  [pdf, other]

    cs.CR cs.AI cs.SD eess.AS

    Attacks as Defenses: Designing Robust Audio CAPTCHAs Using Attacks on Automatic Speech Recognition Systems

    Authors: Hadi Abdullah, Aditya Karlekar, Saurabh Prasad, Muhammad Sajidur Rahman, Logan Blue, Luke A. Bauer, Vincent Bindschaedler, Patrick Traynor

    Abstract: Audio CAPTCHAs are supposed to provide a strong defense for online resources; however, advances in speech-to-text mechanisms have rendered these defenses ineffective. Audio CAPTCHAs cannot simply be abandoned, as they are specifically named by the W3C as important enablers of accessibility. Accordingly, demonstrably more robust audio CAPTCHAs are important to the future of a secure and accessible…

    Submitted 10 March, 2022; originally announced March 2022.

  11. arXiv:2202.11784  [pdf, other]

    cs.RO eess.SY

    Design and experimental investigation of a vibro-impact self-propelled capsule robot with orientation control

    Authors: Jiajia Zhang, Jiyuan Tian, Dibin Zhu, Yang Liu, Shyam Prasad

    Abstract: This paper presents a novel design and experimental investigation for a self-propelled capsule robot that can be used for painless colonoscopy during a retrograde progression from the patient's rectum. The steerable robot is driven forward and backward via its internal vibration and impact with orientation control by using an electromagnetic actuator. The actuator contains four sets of coils and a…

    Submitted 1 March, 2022; v1 submitted 23 February, 2022; originally announced February 2022.

    Comments: ICRA 2022 Conference paper

  12. arXiv:2111.15625  [pdf]

    eess.SP

    Mean Square Performance of a family of Adaptive Algorithms for colored noise

    Authors: R Sankara Prasad

    Abstract: In real-time applications the characteristics and properties of a signal vary inconsistently. So, to maintain the integrity of such signals there is a need for effective adaptive filters. The conventional Least Mean Squared (LMS) algorithm is widely used because of its computational simplicity and ease of implementation. But its convergence speed rapidly reduces when colored noise is present in th…

    Submitted 21 November, 2021; originally announced November 2021.

  13. arXiv:2104.01793  [pdf, other]

    eess.SP

    Analysis of bio-electro-chemical signals from passive sweat-based wearable electro-impedance spectroscopy (EIS) towards assessing blood glucose modulations

    Authors: Devangsingh Sankhala, Madhavi Pali, Kai-Chun Lin, Badrinath Jagannath, Sriram Muthukumar, Shalini Prasad

    Abstract: There has been a recent tremendous interest in label-free detection of biomarkers which is a critical enabler of point-of-need diagnostics. A low-power, small form factor, multiplexed wearable system is proposed for continuous detection of glucose in passively expressed sweat using electrochemical impedance spectroscopy (EIS) measurement. The wearable EIS system consists of a sensing analog front…

    Submitted 5 April, 2021; originally announced April 2021.

  14. arXiv:2007.08592  [pdf, other]

    cs.CV cs.LG eess.IV

    Advances in Deep Learning for Hyperspectral Image Analysis--Addressing Challenges Arising in Practical Imaging Scenarios

    Authors: Xiong Zhou, Saurabh Prasad

    Abstract: Deep neural networks have proven to be very effective for computer vision tasks, such as image classification, object detection, and semantic segmentation -- these are primarily applied to color imagery and video. In recent years, there has been an emergence of deep learning algorithms being applied to hyperspectral and multispectral imagery for remote sensing and biomedicine tasks. These multi-ch…

    Submitted 16 July, 2020; originally announced July 2020.

    Comments: Published as a chapter in Hyperspectral Image Analysis. Advances in Computer Vision and Pattern Recognition

  15. arXiv:2006.02858  [pdf, other]

    eess.IV math.NA physics.optics

    Point Spread Function Engineering for 3D Imaging of Space Debris using a Continuous Exact l0 Penalty (CEL0) Based Algorithm

    Authors: Chao Wang, Raymond H. Chan, Robert J. Plemmons, Sudhakar Prasad

    Abstract: We consider three-dimensional (3D) localization and imaging of space debris from only one two-dimensional (2D) snapshot image. The technique involves an optical imager that exploits off-center image rotation to encode both the lateral and depth coordinates of point sources, with the latter being encoded in the angle of rotation of the PSF. We formulate 3D localization into a large-scale sparse 3D…

    Submitted 2 June, 2020; originally announced June 2020.

    Comments: 12 pages. arXiv admin note: substantial text overlap with arXiv:1809.10541, arXiv:1804.04000

    Journal ref: International Workshop On Image Processing and Inverse Problems (2020)

  16. arXiv:2004.06334  [pdf]

    eess.IV cs.CV

    Automated Diabetic Retinopathy Grading using Deep Convolutional Neural Network

    Authors: Saket S. Chaturvedi, Kajol Gupta, Vaishali Ninawe, Prakash S. Prasad

    Abstract: Diabetic Retinopathy is a global health problem that affects 100 million individuals worldwide, and in the next few decades, these incidences are expected to reach epidemic proportions. Diabetic Retinopathy is a subtle eye disease that can cause sudden, irreversible vision loss. The early-stage Diabetic Retinopathy diagnosis can be challenging for human experts, considering the visual complexity of…

    Submitted 14 April, 2020; originally announced April 2020.

    Comments: © 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  17. arXiv:2001.04940  [pdf, other]

    eess.AS cs.SD

    Two Channel Audio Zooming System For Smartphone

    Authors: Anant Khandelwal, E. B. Goud, Y. Chand, L. Kumar, S. Prasad, N. Agarwala, R. Singh

    Abstract: In this paper, a two-microphone-based system for audio zooming is proposed for the first time. The audio zooming application allows sound capture and enhancement from the front direction while attenuating interfering sources from all other directions. The complete audio zooming system utilizes beamforming based target extraction. In particular, Minimum Power Distortionless Response (MPDR) beamforme…

    Submitted 13 January, 2020; originally announced January 2020.

    Comments: Pre-print for WASPAA

  18. Skin Lesion Analyser: An Efficient Seven-Way Multi-Class Skin Cancer Classification Using MobileNet

    Authors: Saket S. Chaturvedi, Kajol Gupta, Prakash. S. Prasad

    Abstract: Skin cancer, a major form of cancer, is a critical public health problem with 123,000 newly diagnosed melanoma cases and between 2 and 3 million non-melanoma cases worldwide each year. The leading cause of skin cancer is high exposure of skin cells to UV radiation, which can damage the DNA inside skin cells leading to uncontrolled growth of skin cells. Skin cancer is primarily diagnosed visually e…

    Submitted 27 May, 2020; v1 submitted 7 July, 2019; originally announced July 2019.

    Comments: This is a pre-copyedited version of a contribution published in Advances in Intelligent Systems and Computing, Hassanien A., Bhatnagar R., Darwish A. (eds) published by Chaturvedi S.S., Gupta K., Prasad P.S. The definitive authentication version is available online via https://doi.org/10.1007/978-981-15-3383-9_15

    Report number: AISC, volume 1141

    Journal ref: In: Hassanien A., Bhatnagar R., Darwish A. (eds) Advanced Machine Learning Technologies and Applications. AMLTA 2020. Advances in Intelligent Systems and Computing, vol 1141. Springer, Singapore

  19. arXiv:1907.00058  [pdf, other]

    eess.IV cs.CV cs.LG

    Explainable Anatomical Shape Analysis through Deep Hierarchical Generative Models

    Authors: Carlo Biffi, Juan J. Cerrolaza, Giacomo Tarroni, Wenjia Bai, Antonio de Marvao, Ozan Oktay, Christian Ledig, Loic Le Folgoc, Konstantinos Kamnitsas, Georgia Doumou, Jinming Duan, Sanjay K. Prasad, Stuart A. Cook, Declan P. O'Regan, Daniel Rueckert

    Abstract: Quantification of anatomical shape changes currently relies on scalar global indexes which are largely insensitive to regional or asymmetric modifications. Accurate assessment of pathology-driven anatomical remodeling is a crucial step for the diagnosis and treatment of many conditions. Deep learning approaches have recently achieved wide success in the analysis of medical images, but they lack in…

    Submitted 4 January, 2020; v1 submitted 28 June, 2019; originally announced July 2019.

    Comments: Accepted for publication in IEEE Transactions on Medical Imaging (TMI)

  20. arXiv:1906.04749  [pdf, other]

    eess.IV cs.CV math.NA

    Joint 3D Localization and Classification of Space Debris using a Multispectral Rotating Point Spread Function

    Authors: Chao Wang, Grey Ballard, Robert Plemmons, Sudhakar Prasad

    Abstract: We consider the problem of joint three-dimensional (3D) localization and material classification of unresolved space debris using a multispectral rotating point spread function (RPSF). The use of RPSF allows one to estimate the 3D locations of point sources from their rotated images acquired by a single 2D sensor array, since the amount of rotation of each source image about its x, y location depe…

    Submitted 11 June, 2019; originally announced June 2019.

    Comments: 25 pages

  21. Direction of Arrival Estimation for Nanoscale Sensor Networks

    Authors: Shree M. Prasad, Trilochan Panigrahi, Mahbub Hassan

    Abstract: Nanoscale wireless sensor networks (NWSNs) could be within reach soon using graphene-based antennas, which resonate in 0.1-10 terahertz band. To conserve the limited energy available at nanoscale, it is expected that NWSNs will communicate using extremely short pulses on the order of femtoseconds. Accurate estimation of direction of arrival (DOA) for such terahertz pulses will help realize many us…

    Submitted 12 July, 2018; originally announced July 2018.

    Comments: 6 Pages, 9 figures, Camera Ready Version, NANOCOM '18: ACM The Fifth Annual International Conference on Nanoscale Computing and Communication, September 5–7, 2018, Reykjavik, Iceland

  22. Non-convex optimization for 3D point source localization using a rotating point spread function

    Authors: Chao Wang, Raymond Chan, Mila Nikolova, Robert Plemmons, Sudhakar Prasad

    Abstract: We consider the high-resolution imaging problem of 3D point source image recovery from 2D data using a method based on point spread function (PSF) engineering. The method involves a new technique, recently proposed by S. Prasad, based on the use of a rotating PSF with a single lobe to obtain depth from defocus. The amount of rotation of the PSF encodes the depth position of the point source. Appli…

    Submitted 27 September, 2018; v1 submitted 10 April, 2018; originally announced April 2018.

    Comments: 28 pages

    Journal ref: SIAM J. Imaging Sci. 2019

  23. arXiv:1706.00897  [pdf]

    eess.SY

    Optimization of LMS Algorithm for System Identification

    Authors: Saurabh R. Prasad, Bhalchandra B. Godbole

    Abstract: An adaptive filter is defined as a digital filter that has the capability of self-adjusting its transfer function under the control of some optimizing algorithm. The most common optimizing algorithms are Least Mean Square (LMS) and Recursive Least Square (RLS). Although the RLS algorithm performs better than the LMS algorithm, it has very high computational complexity, so it is not useful in most of the practical sc…

    Submitted 3 June, 2017; originally announced June 2017.

    Comments: 13 pages, 6 figures, 1 table