-
Tracking Articulatory Dynamics in Speech with a Fixed-Weight BiLSTM-CNN Architecture
Authors:
Leena G Pillai,
D. Muhammad Noorul Mubarak,
Elizabeth Sherly
Abstract:
Speech production is a complex sequential process which involve the coordination of various articulatory features. Among them tongue being a highly versatile active articulator responsible for shaping airflow to produce targeted speech sounds that are intellectual, clear, and distinct. This paper presents a novel approach for predicting tongue and lip articulatory features involved in a given spee…
▽ More
Speech production is a complex sequential process which involve the coordination of various articulatory features. Among them tongue being a highly versatile active articulator responsible for shaping airflow to produce targeted speech sounds that are intellectual, clear, and distinct. This paper presents a novel approach for predicting tongue and lip articulatory features involved in a given speech acoustics using a stacked Bidirectional Long Short-Term Memory (BiLSTM) architecture, combined with a one-dimensional Convolutional Neural Network (CNN) for post-processing with fixed weights initialization. The proposed network is trained with two datasets consisting of simultaneously recorded speech and Electromagnetic Articulography (EMA) datasets, each introducing variations in terms of geographical origin, linguistic characteristics, phonetic diversity, and recording equipment. The performance of the model is assessed in Speaker Dependent (SD), Speaker Independent (SI), corpus dependent (CD) and cross corpus (CC) modes. Experimental results indicate that the proposed model with fixed weights approach outperformed the adaptive weights initialization with in relatively minimal number of training epochs. These findings contribute to the development of robust and efficient models for articulatory feature prediction, paving the way for advancements in speech production research and applications.
△ Less
Submitted 25 April, 2025;
originally announced April 2025.
-
Acoustic to Articulatory Inversion of Speech; Data Driven Approaches, Challenges, Applications, and Future Scope
Authors:
Leena G Pillai,
D. Muhammad Noorul Mubarak
Abstract:
This review is focused on the data-driven approaches applied in different applications of Acoustic-to-Articulatory Inversion (AAI) of speech. This review paper considered the relevant works published in the last ten years (2011-2021). The selection criteria includes (a) type of AAI - Speaker Dependent and Speaker Independent AAI, (b) objectives of the work - Articulatory approximation, Articulator…
▽ More
This review is focused on the data-driven approaches applied in different applications of Acoustic-to-Articulatory Inversion (AAI) of speech. This review paper considered the relevant works published in the last ten years (2011-2021). The selection criteria includes (a) type of AAI - Speaker Dependent and Speaker Independent AAI, (b) objectives of the work - Articulatory approximation, Articulatory Feature space selection and Automatic Speech Recognition (ASR), explore the correlation between acoustic and articulatory features, and framework for Computer-assisted language training, (c) Corpus - Simultaneously recorded speech (wav) and medical imaging models such as ElectroMagnetic Articulography (EMA), Electropalatography (EPG), Laryngography, Electroglottography (EGG), X-ray Cineradiography, Ultrasound, and real-time Magnetic Resonance Imaging (rtMRI), (d) Methods or models - recent works are considered, and therefore all the works are based on machine learning, (e) Evaluation - as AAI is a non-linear regression problem, the performance evaluation is mostly done by Correlation Coefficient (CC), Root Mean Square Error (RMSE), and also considered Mean Square Error (MSE), and Mean Format Error (MFE). The practical application of the AAI model can provide a better and user-friendly interpretable image feedback system of articulatory positions, especially tongue movement. Such trajectory feedback system can be used to provide phonetic, language, and speech therapy for pathological subjects.
△ Less
Submitted 17 April, 2025;
originally announced April 2025.
-
Modeling and Analysis of Application Interference on Dragonfly+
Authors:
Yao Kang,
Xin Wang,
Neil McGlohon,
Misbah Mubarak,
Sudheer Chunduri,
Zhiling Lan
Abstract:
Dragonfly class of networks are considered as promising interconnects for next-generation supercomputers. While Dragonfly+ networks offer more path diversity than the original Dragonfly design, they are still prone to performance variability due to their hierarchical architecture and resource sharing design. Event-driven network simulators are indispensable tools for navigating complex system desi…
▽ More
Dragonfly class of networks are considered as promising interconnects for next-generation supercomputers. While Dragonfly+ networks offer more path diversity than the original Dragonfly design, they are still prone to performance variability due to their hierarchical architecture and resource sharing design. Event-driven network simulators are indispensable tools for navigating complex system design. In this study, we quantitatively evaluate a variety of application communication interactions on a 3,456-node Dragonfly+ system by using the CODES toolkit. This study looks at the impact of communication interference from a user's perspective. Specifically, for a given application submitted by a user, we examine how this application will behave with the existing workload running in the system under different job placement policies. Our simulation study considers hundreds of experiment configurations including four target applications with representative communication patterns under a variety of network traffic conditions. Our study shows that intra-job interference can cause severe performance degradation for communication-intensive applications. Inter-job interference can generally be reduced for applications with one-to-one or one-to-many communication patterns through job isolation. Application with one-to-all communication pattern is resilient to network interference.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
Union: An Automatic Workload Manager for Accelerating Network Simulation
Authors:
Xin Wang,
Misbah Mubarak,
Yao Kang,
Robert B. Ross,
Zhiling Lan
Abstract:
With the rapid growth of the machine learning applications, the workloads of future HPC systems are anticipated to be a mix of scientific simulation, big data analytics, and machine learning applications. Simulation is a great research vehicle to understand the performance implications of co-running scientific applications with big data and machine learning workloads on large-scale systems. In thi…
▽ More
With the rapid growth of the machine learning applications, the workloads of future HPC systems are anticipated to be a mix of scientific simulation, big data analytics, and machine learning applications. Simulation is a great research vehicle to understand the performance implications of co-running scientific applications with big data and machine learning workloads on large-scale systems. In this paper, we present Union, a workload manager that provides an automatic framework to facilitate hybrid workload simulation in CODES. Furthermore, we use Union, along with CODES, to investigate various hybrid workloads composed of traditional simulation applications and emerging learning applications on two dragonfly systems. The experiment results show that both message latency and communication time are important performance metrics to evaluate network interference. Network interference on HPC applications is more reflected by the message latency variation, whereas ML application performance depends more on the communication time.
△ Less
Submitted 3 April, 2024; v1 submitted 24 March, 2024;
originally announced March 2024.
-
A Visual Analytics Framework for Reviewing Streaming Performance Data
Authors:
Suraj P. Kesavan,
Takanori Fujiwara,
Jianping Kelvin Li,
Caitlin Ross,
Misbah Mubarak,
Christopher D. Carothers,
Robert B. Ross,
Kwan-Liu Ma
Abstract:
Understanding and tuning the performance of extreme-scale parallel computing systems demands a streaming approach due to the computational cost of applying offline algorithms to vast amounts of performance log data. Analyzing large streaming data is challenging because the rate of receiving data and limited time to comprehend data make it difficult for the analysts to sufficiently examine the data…
▽ More
Understanding and tuning the performance of extreme-scale parallel computing systems demands a streaming approach due to the computational cost of applying offline algorithms to vast amounts of performance log data. Analyzing large streaming data is challenging because the rate of receiving data and limited time to comprehend data make it difficult for the analysts to sufficiently examine the data without missing important changes or patterns. To support streaming data analysis, we introduce a visual analytic framework comprising of three modules: data management, analysis, and interactive visualization. The data management module collects various computing and communication performance metrics from the monitored system using streaming data processing techniques and feeds the data to the other two modules. The analysis module automatically identifies important changes and patterns at the required latency. In particular, we introduce a set of online and progressive analysis methods for not only controlling the computational costs but also helping analysts better follow the critical aspects of the analysis results. Finally, the interactive visualization module provides the analysts with a coherent view of the changes and patterns in the continuously captured performance data. Through a multi-faceted case study on performance analysis of parallel discrete-event simulation, we demonstrate the effectiveness of our framework for identifying bottlenecks and locating outliers.
△ Less
Submitted 25 January, 2020;
originally announced January 2020.