Showing 1–16 of 16 results for author: Fels, S

  1. arXiv:2407.18516  [pdf]

    eess.AS eess.SY

    Integrating Posture Control in Speech Motor Models: A Parallel-Structured Simulation Approach

    Authors: Yadong Liu, Sidney Fels, Arian Shamei, Najeeb Khan, Bryan Gick

    Abstract: Posture is an essential aspect of motor behavior, necessitating continuous muscle activation to counteract gravity. It remains stable under perturbation, aiding in maintaining bodily balance and enabling movement execution. Similarities have been observed between gross body postures and speech postures, such as those involving the jaw, tongue, and lips, which also exhibit resilience to perturbatio…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures

  2. arXiv:2309.14586  [pdf, other]

    cs.SD cs.AI cs.CV eess.AS eess.SP

    Speech Audio Synthesis from Tagged MRI and Non-Negative Matrix Factorization via Plastic Transformer

    Authors: Xiaofeng Liu, Fangxu Xing, Maureen Stone, Jiachen Zhuo, Sidney Fels, Jerry L. Prince, Georges El Fakhri, Jonghye Woo

    Abstract: The tongue's intricate 3D structure, comprising localized functional units, plays a crucial role in the production of speech. When measured using tagged MRI, these functional units exhibit cohesive displacements and derived quantities that facilitate the complex process of speech production. Non-negative matrix factorization-based approaches have been shown to estimate the functional units through…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: MICCAI 2023 (Oral presentation)

  3. arXiv:2102.04588  [pdf, other]

    cs.SD cs.CL eess.AS

    A comparative study of two-dimensional vocal tract acoustic modeling based on Finite-Difference Time-Domain methods

    Authors: Debasish Ray Mohapatra, Victor Zappi, Sidney Fels

    Abstract: The two-dimensional (2D) numerical approaches for vocal tract (VT) modelling can afford a better balance between the low computational cost and accurate rendering of acoustic wave propagation. However, they require a high spatio-temporal resolution in the numerical scheme for a precise estimation of acoustic formants at the simulation run-time expense. We have recently proposed a new VT acoustic m…

    Submitted 8 February, 2021; originally announced February 2021.

    Comments: 4 pages, 3 figures

  4. arXiv:2102.01640  [pdf, other]

    cs.SD cs.CL eess.AS

    SPEAK WITH YOUR HANDS Using Continuous Hand Gestures to control Articulatory Speech Synthesizer

    Authors: Pramit Saha, Debasish Ray Mohapatra, Sidney Fels

    Abstract: This work presents our advancements in controlling an articulatory speech synthesis engine, viz., Pink Trombone, with hand gestures. Our interface translates continuous finger movements and wrist flexion into continuous speech using vocal tract area-function based articulatory speech synthesis. We use Cyberglove II with 18 sensors to capture the kinematic information of the wrist and the…

    Submitted 2 February, 2021; originally announced February 2021.

    Comments: 2 pages, 1 figure

  5. arXiv:2010.14228  [pdf]

    cs.HC cs.SD eess.AS

    New interfaces for musical expression

    Authors: Ivan Poupyrev, Michael J. Lyons, Sidney Fels, Tina Blaine

    Abstract: The rapid evolution of electronics, digital media, advanced materials, and other areas of technology, is opening up unprecedented opportunities for musical interface inventors and designers. The possibilities afforded by these new technologies carry with them the challenges of a complex and often confusing array of choices for musical composers and performers. New musical technologies are at least…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: 2 pages, This item describes the CHI'01 workshop which started the International Conference on New Interfaces for Musical Expression

    ACM Class: H.5.5

    Journal ref: ACM CHI'01 Extended Abstracts on Human Factors in Computing Systems, March 2001 Pages 491-492

  6. arXiv:2006.16367  [pdf, other]

    eess.IV cs.LG cs.SD eess.AS stat.ML

    Ultra2Speech -- A Deep Learning Framework for Formant Frequency Estimation and Tracking from Ultrasound Tongue Images

    Authors: Pramit Saha, Yadong Liu, Bryan Gick, Sidney Fels

    Abstract: Thousands of individuals need surgical removal of their larynx due to critical diseases every year and therefore, require an alternative form of communication to articulate speech sounds after the loss of their voice box. This work addresses the articulatory-to-acoustic mapping problem based on ultrasound (US) tongue images for the development of a silent-speech interface (SSI) that can provide th…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: Accepted for publication in MICCAI 2020

  7. arXiv:2005.09463  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Learning Joint Articulatory-Acoustic Representations with Normalizing Flows

    Authors: Pramit Saha, Sidney Fels

    Abstract: The articulatory geometric configurations of the vocal tract and the acoustic properties of the resultant speech sound are considered to have a strong causal relationship. This paper aims at finding a joint latent representation between the articulatory and acoustic domain for vowel sounds via invertible neural network models, while simultaneously preserving the respective domain-specific features…

    Submitted 30 September, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

    Comments: 5 pages, 4 figures, accepted for publication in Interspeech 2020

  8. arXiv:1912.03120  [pdf, other]

    eess.IV cs.LG stat.ML

    A Study into Echocardiography View Conversion

    Authors: Amir H. Abdi, Mohammad H. Jafari, Sidney Fels, Theresa Tsang, Purang Abolmaesumi

    Abstract: Transthoracic echo is one of the most common means of cardiac studies in the clinical routines. During the echo exam, the sonographer captures a set of standard cross sections (echo views) of the heart. Each 2D echo view cuts through the 3D cardiac geometry via a unique plane. Consequently, different views share some limited information. In this work, we investigate the feasibility of generating a…

    Submitted 5 December, 2019; originally announced December 2019.

    Comments: Workshop of Medical Imaging Meets NeurIPS, NeurIPS 2019

  9. arXiv:1910.13859  [pdf, other]

    eess.SP

    Reinforcement Learning for High-dimensional Continuous Control in Biomechanics: An Intro to ArtiSynth-RL

    Authors: Amir H. Abdi, Masoud Malakoutian, Thomas Oxland, Sidney Fels

    Abstract: Neural control is an exciting mystery which we instinctively master. Yet, researchers have a hard time explaining the motor control trajectories. Physiologically accurate biomechanical simulations can, to some extent, mimic live subjects and help us form evidence-based hypotheses. In these simulated environments, muscle excitations are typically calculated through inverse dynamic optimizations whi…

    Submitted 9 December, 2019; v1 submitted 25 October, 2019; originally announced October 2019.

    Comments: Deep Reinforcement Learning Workshop NeurIPS 2019

  10. An extended two-dimensional vocal tract model for fast acoustic simulation of single-axis symmetric three-dimensional tubes

    Authors: Debasish Ray Mohapatra, Victor Zappi, Sidney Fels

    Abstract: The simulation of two-dimensional (2D) wave propagation is an affordable computational task and its use can potentially improve time performance in vocal tracts' acoustic analysis. Several models have been designed that rely on 2D wave solvers and include 2D representations of three-dimensional (3D) vocal tract-like geometries. However, until now, only the acoustics of straight 3D tubes with circu…

    Submitted 18 September, 2019; originally announced September 2019.

    Comments: 5 pages, 2 figures, Interspeech 2019 submission

  11. arXiv:1904.05746  [pdf, other]

    cs.LG cs.CL cs.SD eess.AS stat.ML

    SPEAK YOUR MIND! Towards Imagined Speech Recognition With Hierarchical Deep Learning

    Authors: Pramit Saha, Muhammad Abdul-Mageed, Sidney Fels

    Abstract: Speech-related Brain Computer Interface (BCI) technologies provide effective vocal communication strategies for controlling devices through speech commands interpreted from brain signals. In order to infer imagined speech from active thoughts, we propose a novel hierarchical deep learning BCI system for subject-independent classification of 11 speech tokens including phonemes and words. Our novel…

    Submitted 8 April, 2019; originally announced April 2019.

    Comments: Under review in INTERSPEECH 2019. arXiv admin note: text overlap with arXiv:1904.04358

  12. arXiv:1904.04358  [pdf, other]

    cs.LG cs.CL cs.SD eess.AS stat.ML

    Deep Learning the EEG Manifold for Phonological Categorization from Active Thoughts

    Authors: Pramit Saha, Muhammad Abdul-Mageed, Sidney Fels

    Abstract: Speech-related Brain Computer Interfaces (BCI) aim primarily at finding an alternative vocal communication pathway for people with speaking disabilities. As a step towards full decoding of imagined speech from active thoughts, we present a BCI system for subject-independent classification of phonological categories exploiting a novel deep learning based hierarchical feature extraction scheme. To b…

    Submitted 8 April, 2019; originally announced April 2019.

    Comments: Accepted for publication in IEEE ICASSP 2019

  13. arXiv:1904.04352  [pdf, other]

    cs.LG eess.IV stat.ML

    Hierarchical Deep Feature Learning For Decoding Imagined Speech From EEG

    Authors: Pramit Saha, Sidney Fels

    Abstract: We propose a mixed deep neural network strategy, incorporating parallel combination of Convolutional (CNN) and Recurrent Neural Networks (RNN), cascaded with deep autoencoders and fully connected layers towards automatic identification of imagined speech from EEG. Instead of utilizing raw EEG channel data, we compute the joint variability of the channels in the form of a covariance matrix that pro…

    Submitted 8 April, 2019; originally announced April 2019.

    Comments: Accepted in AAAI 2019 under Student Abstract and Poster Program

  14. arXiv:1811.08029  [pdf, other]

    cs.SD eess.AS

    Sound-Stream II: Towards Real-Time Gesture Controlled Articulatory Sound Synthesis

    Authors: Pramit Saha, Debasish Ray Mohapatra, Praneeth SV, Sidney Fels

    Abstract: We present an interface involving four degrees-of-freedom (DOF) mechanical control of a two dimensional, mid-sagittal tongue through a biomechanical toolkit called ArtiSynth and a sound synthesis engine called JASS towards articulatory sound synthesis. As a demonstration of the project, the user will learn to produce a range of JASS vocal sounds, by varying the shape and position of the ArtiSynth…

    Submitted 19 November, 2018; originally announced November 2018.

  15. arXiv:1811.07435  [pdf, other]

    cs.SD cs.CL eess.AS

    Limitations of Source-Filter Coupling In Phonation

    Authors: Debasish Ray Mohapatra, Sidney Fels

    Abstract: The coupling of vocal fold (source) and vocal tract (filter) is one of the most critical factors in source-filter articulation theory. The traditional linear source-filter theory has been challenged by current research which clearly shows the impact of acoustic loading on the dynamic behavior of the vocal fold vibration as well as the variations in the glottal flow pulses shape. This paper outline…

    Submitted 18 November, 2018; originally announced November 2018.

    Comments: 2 pages, 2 figures

  16. arXiv:1807.11089  [pdf, other]

    cs.SD cs.CL cs.CV cs.LG eess.AS

    Towards Automatic Speech Identification from Vocal Tract Shape Dynamics in Real-time MRI

    Authors: Pramit Saha, Praneeth Srungarapu, Sidney Fels

    Abstract: Vocal tract configurations play a vital role in generating distinguishable speech sounds, by modulating the airflow and creating different resonant cavities in speech production. They contain abundant information that can be utilized to better understand the underlying speech production mechanism. As a step towards automatic mapping of vocal tract shape geometry to acoustics, this paper employs ef…

    Submitted 29 July, 2018; originally announced July 2018.

    Comments: To appear in the INTERSPEECH 2018 Proceedings