Robust Cross-Etiology and Speaker-Independent Dysarthric Speech Recognition

Singh, Satwinder; Wang, Qianli; Zhong, Zihan; Mendes, Clarion; Hasegawa-Johnson, Mark; Abdulla, Waleed; Shahamiri, Seyed Reza

arXiv:2501.14994 (cs)

[Submitted on 25 Jan 2025]

Abstract: In this paper, we present a speaker-independent dysarthric speech recognition system, with a focus on evaluating the recently released Speech Accessibility Project (SAP-1005) dataset, which includes speech data from individuals with Parkinson's disease (PD). Despite the growing body of research in dysarthric speech recognition, many existing systems are speaker-dependent and adaptive, limiting their generalizability across different speakers and etiologies. Our primary objective is to develop a robust speaker-independent model capable of accurately recognizing dysarthric speech, irrespective of the speaker. Additionally, as a secondary objective, we aim to test the cross-etiology performance of our model by evaluating it on the TORGO dataset, which contains speech samples from individuals with cerebral palsy (CP) and amyotrophic lateral sclerosis (ALS). By leveraging the Whisper model, our speaker-independent system achieved a CER of 6.99% and a WER of 10.71% on the SAP-1005 dataset. Further, in cross-etiology settings, we achieved a CER of 25.08% and a WER of 39.56% on the TORGO dataset. These results highlight the potential of our approach to generalize across unseen speakers and different etiologies of dysarthria.
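For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how character error rate (CER) and word error rate (WER) are conventionally defined: the Levenshtein edit distance between reference and hypothesis, divided by the reference length. The abstract does not describe the authors' scoring pipeline, so the function names, the whitespace tokenization, and the absence of any text normalization here are assumptions made for illustration only.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp prefix
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate; assumes plain whitespace tokenization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; spaces are counted as characters (an assumption)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sat on mat"
    print(f"WER = {wer(ref, hyp):.4f}")
    print(f"CER = {cer(ref, hyp):.4f}")
```

Because both rates are normalized by the reference length, a hypothesis with many insertions can score above 100%. Corpus-level figures such as those reported above are usually obtained by summing edit distances and reference lengths over all utterances before dividing, rather than by averaging per-utterance rates; whether the authors did so is not stated in the abstract.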

Comments:	Accepted to ICASSP 2025
Subjects:	Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2501.14994 [cs.SD]
	(or arXiv:2501.14994v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2501.14994

Submission history

From: Satwinder Singh PhD
[v1] Sat, 25 Jan 2025 00:02:58 UTC (1,316 KB)
