Showing 1–22 of 22 results for author: Tripathi, A

Searching in archive eess.
  1. arXiv:2504.01939  [pdf, ps, other]

    eess.SP

    Laboratory evaluation of a wearable instrumented headband for rotational head kinematics measurement

    Authors: Anu Tripathi, Yang Wan, Sushant Malave, Sheila Turcsanyi, Alice Lux Fawzi, Alison Brooks, Haneesh Kesari, Traci Snedden, Peter Ferrazzano, Christian Franck, Rika Carlsen

    Abstract: Mild traumatic brain injuries (mTBI) are a highly prevalent condition with heterogeneous outcomes between individuals. A key factor governing brain tissue deformation and the risk of mTBI is the rotational kinematics of the head. Instrumented mouthguards are a widely accepted method for measuring rotational head motions, owing to their robust sensor-skull coupling. However, wearing mouthguards is…

    Submitted 2 April, 2025; originally announced April 2025.

    Comments: 22 pages, 8 figures, 1 table

  2. arXiv:2408.10816  [pdf, other]

    eess.SP cs.LG

    Deep Learning-based Classification of Dementia using Image Representation of Subcortical Signals

    Authors: Shivani Ranjan, Ayush Tripathi, Harshal Shende, Robin Badal, Amit Kumar, Pramod Yadav, Deepak Joshi, Lalan Kumar

    Abstract: Dementia is a neurological syndrome marked by cognitive decline. Alzheimer's disease (AD) and Frontotemporal dementia (FTD) are the common forms of dementia, each with distinct progression patterns. EEG, a non-invasive tool for recording brain activity, has shown potential in distinguishing AD from FTD and mild cognitive impairment (MCI). Previous studies have utilized various EEG features, such a…

    Submitted 20 August, 2024; originally announced August 2024.

  3. arXiv:2408.02582  [pdf, other]

    cs.SD cs.AI eess.AS

    Clustering and Mining Accented Speech for Inclusive and Fair Speech Recognition

    Authors: Jaeyoung Kim, Han Lu, Soheil Khorram, Anshuman Tripathi, Qian Zhang, Hasim Sak

    Abstract: Modern automatic speech recognition (ASR) systems are typically trained on more than tens of thousands hours of speech data, which is one of the main factors for their great success. However, the distribution of such data is typically biased towards common accents or typical speech patterns. As a result, those systems often poorly perform on atypical accented speech. In this paper, we present acce…

    Submitted 5 August, 2024; originally announced August 2024.

  4. arXiv:2311.15752  [pdf, other]

    eess.SP

    Insights into Age-Related Functional Brain Changes during Audiovisual Integration Tasks: A Comprehensive EEG Source-Based Analysis

    Authors: Prerna Singh, Ayush Tripathi, Lalan Kumar, Tapan Kumar Gandhi

    Abstract: The seamless integration of visual and auditory information is a fundamental aspect of human cognition. Although age-related functional changes in Audio-Visual Integration (AVI) have been extensively explored in the past, thorough studies across various age groups remain insufficient. Previous studies have provided valuable insights into age-related AVI using EEG-based sensor data. However, these s…

    Submitted 27 November, 2023; originally announced November 2023.

  5. arXiv:2304.06315  [pdf, other]

    eess.SP cs.SD eess.AS q-bio.NC

    Brain Connectivity Features-based Age Group Classification using Temporal Asynchrony Audio-Visual Integration Task

    Authors: Prerna Singh, Ayush Tripathi, Lalan Kumar, Tapan Kumar Gandhi

    Abstract: The process of integration of inputs from several sensory modalities in the human brain is referred to as multisensory integration. Age-related cognitive decline leads to a loss in the ability of the brain to conceive multisensory inputs. There has been considerable work done in the study of such cognitive changes for the old age groups. However, in the case of middle age groups, such analysis is…

    Submitted 1 May, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

  6. arXiv:2203.15250  [pdf, other]

    eess.SP cs.CL cs.SD eess.AS q-bio.NC

    Analysis of EEG frequency bands for Envisioned Speech Recognition

    Authors: Ayush Tripathi

    Abstract: The use of Automatic speech recognition (ASR) interfaces has become increasingly popular in daily life for use in interaction and control of electronic devices. The interfaces currently being used are not feasible for a variety of users such as those suffering from a speech disorder, locked-in syndrome, paralysis or people with utmost privacy requirements. In such cases, an interface that can ide…

    Submitted 29 March, 2022; originally announced March 2022.

  7. arXiv:2109.11641  [pdf, other]

    eess.AS cs.LG cs.SD

    Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection

    Authors: Wei Xia, Han Lu, Quan Wang, Anshuman Tripathi, Yiling Huang, Ignacio Lopez Moreno, Hasim Sak

    Abstract: In this paper, we present a novel speaker diarization system for streaming on-device applications. In this system, we use a transformer transducer to detect the speaker turns, represent each speaker turn by a speaker embedding, then cluster these embeddings with constraints from the detected speaker turns. Compared with conventional clustering-based diarization systems, our system largely reduces…

    Submitted 25 January, 2022; v1 submitted 23 September, 2021; originally announced September 2021.

  8. arXiv:2109.09664  [pdf, ps, other]

    eess.SP

    Hybrid Transceiver Design for Tera-Hertz MIMO Systems Relying on Bayesian Learning Aided Sparse Channel Estimation

    Authors: Suraj Srivastava, Ajeet Tripathi, Neeraj Varshney, Aditya K. Jagannatham, Lajos Hanzo

    Abstract: Hybrid transceiver design in multiple-input multiple-output (MIMO) Tera-Hertz (THz) systems relying on sparse channel state information (CSI) estimation techniques is conceived. To begin with, a practical MIMO channel model is developed for the THz band that incorporates its molecular absorption and reflection losses, as well as its non-line-of-sight (NLoS) rays associated with its diffused compon…

    Submitted 10 January, 2022; v1 submitted 20 September, 2021; originally announced September 2021.

  9. arXiv:2109.06346  [pdf, other]

    eess.IV cs.CV cs.LG

    Physics Driven Domain Specific Transporter Framework with Attention Mechanism for Ultrasound Imaging

    Authors: Arpan Tripathi, Abhilash Rakkunedeth, Mahesh Raveendranatha Panicker, Jack Zhang, Naveenjyote Boora, Jessica Knight, Jacob Jaremko, Yale Tung Chen, Kiran Vishnu Narayan, Kesavadas C

    Abstract: Most applications of deep learning techniques in medical imaging are supervised and require a large number of labeled data which is expensive and requires many hours of careful annotation by experts. In this paper, we propose an unsupervised, physics driven domain specific transporter framework with an attention mechanism to identify relevant key points with applications in ultrasound imaging. The…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: 11 pages, 18 figures (including supplementary material)

  10. arXiv:2106.06987  [pdf]

    eess.IV cs.CV

    Learning the Imaging Landmarks: Unsupervised Key point Detection in Lung Ultrasound Videos

    Authors: Arpan Tripathi, Mahesh Raveendranatha Panicker, Abhilash R Hareendranathan, Yale Tung Chen, Jacob L Jaremko, Kiran Vishnu Narayan, Kesavadas C

    Abstract: Lung ultrasound (LUS) is an increasingly popular diagnostic imaging modality for continuous and periodic monitoring of lung infection, given its advantages of non-invasiveness, non-ionizing nature, portability and easy disinfection. The major landmarks assessed by clinicians for triaging using LUS are pleura, A and B lines. There have been many efforts for the automatic detection of these landmark…

    Submitted 13 June, 2021; originally announced June 2021.

    Comments: 5 pages, 6 figures, submitted to IEEE EMBC 2021

  11. arXiv:2106.05929  [pdf, other]

    eess.IV cs.LG

    Domain Specific Transporter Framework to Detect Fractures in Ultrasound

    Authors: Arpan Tripathi, Abhilash Rakkunedeth, Mahesh Raveendranatha Panicker, Jack Zhang, Naveenjyote Boora, Jacob Jaremko

    Abstract: Ultrasound examination for detecting fractures is ideally suited for Emergency Departments (ED) as it is relatively fast, safe (from ionizing radiation), has dynamic imaging capability and is easily portable. High interobserver variability in manual assessment of ultrasound scans has piqued research interest in automatic assessment techniques using Deep Learning (DL). Most DL techniques are superv…

    Submitted 9 June, 2021; originally announced June 2021.

    Comments: 10 pages, 3 figures

  12. arXiv:2105.05005  [pdf, other]

    eess.AS cs.AI cs.CL cs.SD

    Reducing Streaming ASR Model Delay with Self Alignment

    Authors: Jaeyoung Kim, Han Lu, Anshuman Tripathi, Qian Zhang, Hasim Sak

    Abstract: Reducing prediction delay for streaming end-to-end ASR models with minimal performance regression is a challenging problem. Constrained alignment is a well-known existing approach that penalizes predicted word boundaries using external low-latency acoustic models. On the contrary, recently proposed FastEmit is a sequence-level delay regularization scheme encouraging vocabulary tokens over blanks w…

    Submitted 6 May, 2021; originally announced May 2021.

    Comments: submitted to INTERSPEECH 2021

  13. arXiv:2103.06157  [pdf, other]

    cs.SD cs.AI eess.AS

    Automatic Speaker Independent Dysarthric Speech Intelligibility Assessment System

    Authors: Ayush Tripathi, Swapnil Bhosale, Sunil Kumar Kopparapu

    Abstract: Dysarthria is a condition which hampers the ability of an individual to control the muscles that play a major role in speech delivery. The loss of fine control over muscles that assist the movement of lips, vocal chords, tongue and diaphragm results in abnormal speech delivery. One can assess the severity level of dysarthria by analyzing the intelligibility of speech spoken by an individual. Conti…

    Submitted 10 March, 2021; originally announced March 2021.

    Comments: 29 pages, 2 figures, Computer Speech & Language 2021

  14. arXiv:2010.03192  [pdf, other]

    cs.SD cs.LG eess.AS

    Transformer Transducer: One Model Unifying Streaming and Non-streaming Speech Recognition

    Authors: Anshuman Tripathi, Jaeyoung Kim, Qian Zhang, Han Lu, Hasim Sak

    Abstract: In this paper we present a Transformer-Transducer model architecture and a training technique to unify streaming and non-streaming speech recognition models into one model. The model is composed of a stack of transformer layers for audio encoding with no lookahead or right context and an additional stack of transformer layers on top trained with variable right context. In inference time, the conte…

    Submitted 7 October, 2020; originally announced October 2020.

  15. arXiv:2009.04004  [pdf, other]

    eess.IV cs.CV cs.LG

    Fuzzy Unique Image Transformation: Defense Against Adversarial Attacks On Deep COVID-19 Models

    Authors: Achyut Mani Tripathi, Ashish Mishra

    Abstract: Early identification of COVID-19 using a deep model trained on Chest X-Ray and CT images has gained considerable attention from researchers to speed up the process of identification of active COVID-19 cases. These deep models act as an aid to hospitals that suffer from the unavailability of specialists or radiologists, specifically in remote areas. Various deep models have been proposed to detect…

    Submitted 8 September, 2020; originally announced September 2020.

  16. arXiv:2004.05698  [pdf, other]

    eess.IV cs.CV cs.LG

    Y-net: Biomedical Image Segmentation and Clustering

    Authors: Sharmin Pathan, Anant Tripathi

    Abstract: We propose a deep clustering architecture alongside image segmentation for medical image analysis. The main idea is based on unsupervised learning to cluster images on severity of the disease in the subject's sample, and this image is then segmented to highlight and outline regions of interest. We start with training an autoencoder on the images for segmentation. The encoder part from the autoenco…

    Submitted 26 May, 2020; v1 submitted 12 April, 2020; originally announced April 2020.

  17. arXiv:2002.02562  [pdf, other]

    eess.AS cs.CL cs.SD

    Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss

    Authors: Qian Zhang, Han Lu, Hasim Sak, Anshuman Tripathi, Erik McDermott, Stephen Koo, Shankar Kumar

    Abstract: In this paper we present an end-to-end speech recognition model with Transformer encoders that can be used in a streaming speech recognition system. Transformer computation blocks based on self-attention are used to encode both audio and label sequences independently. The activations from both audio and label encoders are combined with a feed-forward layer to compute a probability distribution ove…

    Submitted 14 February, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: This is the final version of the paper submitted to the ICASSP 2020 on Oct 21, 2019

  18. arXiv:1911.11360  [pdf, other]

    eess.AS cs.SD eess.SP

    Robust Estimation of Hypernasality in Dysarthria with Acoustic Model Likelihood Features

    Authors: Michael Saxon, Ayush Tripathi, Yishan Jiao, Julie Liss, Visar Berisha

    Abstract: Hypernasality is a common characteristic symptom across many motor-speech disorders. For voiced sounds, hypernasality introduces an additional resonance in the lower frequencies and, for unvoiced sounds, there is reduced articulatory precision due to air escaping through the nasal cavity. However, the acoustic manifestation of these symptoms is highly variable, making hypernasality estimation very…

    Submitted 5 August, 2020; v1 submitted 26 November, 2019; originally announced November 2019.

    Comments: 12 pages, 9 figures, 2 tables

    Journal ref: IEEE/ACM Trans. on Audio, Speech, and Language Proc. 28 (2020) 2511-2522

  19. arXiv:1812.11707  [pdf, ps, other]

    cs.RO eess.SY

    UAV Control in Close Proximities - Ceiling Effect on Battery Lifetime

    Authors: Basaran Bahadir Kocer, Volkan Kumtepeli, Tegoeh Tjahjowidodo, Mahardhika Pratama, Anshuman Tripathi, Gerald Seet Gim Lee, Youyi Wang

    Abstract: With the recent developments in unmanned aerial vehicles (UAVs), they are expected to interact and collaborate with their surrounding objects, other robots and people in order to wisely plan and execute particular tasks. Although these interaction operations are inherently challenging as compared to free-flight missions, they might bring diverse advantages. One of them is their basic aerodyna…

    Submitted 31 December, 2018; originally announced December 2018.

    Comments: ICoIAS 2019

  20. arXiv:1808.05312  [pdf, other]

    cs.CL eess.AS

    Toward domain-invariant speech recognition via large scale training

    Authors: Arun Narayanan, Ananya Misra, Khe Chai Sim, Golan Pundak, Anshuman Tripathi, Mohamed Elfeky, Parisa Haghani, Trevor Strohman, Michiel Bacchiani

    Abstract: Current state-of-the-art automatic speech recognition systems are trained to work in specific 'domains', defined based on factors like application, sampling rate and codec. When such recognizers are used in conditions that do not match the training domain, performance significantly drops. This work explores the idea of building a single domain-invariant model for varied use-cases by combining larg…

    Submitted 15 August, 2018; originally announced August 2018.

  21. arXiv:1805.08615  [pdf, other]

    eess.AS cs.SD

    Adversarial Learning of Raw Speech Features for Domain Invariant Speech Recognition

    Authors: Aditay Tripathi, Aanchan Mohan, Saket Anand, Maneesh Singh

    Abstract: Recent advances in neural network based acoustic modelling have shown significant improvements in automatic speech recognition (ASR) performance. In order for acoustic models to be able to handle large acoustic variability, large amounts of labeled data are necessary, which are often expensive to obtain. This paper explores the application of adversarial training to learn features from raw speech t…

    Submitted 21 May, 2018; originally announced May 2018.

    Comments: 5 pages, 1 figure, 2 tables, ICASSP 2018

  22. arXiv:1711.07274  [pdf, ps, other]

    cs.CL cs.SD eess.AS stat.ML

    Speech recognition for medical conversations

    Authors: Chung-Cheng Chiu, Anshuman Tripathi, Katherine Chou, Chris Co, Navdeep Jaitly, Diana Jaunzeikare, Anjuli Kannan, Patrick Nguyen, Hasim Sak, Ananth Sankar, Justin Tansuwan, Nathan Wan, Yonghui Wu, Xuedong Zhang

    Abstract: In this work we explored building automatic speech recognition models for transcribing doctor patient conversation. We collected a large scale dataset of clinical conversations (14,000 hr), designed the task to represent the real-world scenario, and explored several alignment approaches to iteratively improve data quality. We explored both CTC and LAS systems for building speech recognition model…

    Submitted 20 June, 2018; v1 submitted 20 November, 2017; originally announced November 2017.

    Comments: Interspeech 2018 camera ready