Evaluating pretrained speech embedding systems for dysarthria detection across heterogenous datasets

Wihlborg, Lovisa; Goodall, Jemima; Wheatley, David; Webber, Jacob J.; Tam, Johnny; Weaver, Christine; Pal, Suvankar; Chandran, Siddharthan; Seth, Sohan; Watts, Oliver; Valentini-Botinhao, Cassia

arXiv:2509.19946 (eess)

[Submitted on 24 Sep 2025]

Abstract: We present a comprehensive evaluation of pretrained speech embedding systems for the detection of dysarthric speech using existing accessible data. Dysarthric speech datasets are often small and can suffer from recording biases as well as data imbalance. To address these issues, we selected a range of datasets covering related conditions and adopted several cross-validation runs to estimate the chance level. To verify that results are above chance, we compare the distribution of scores across these runs against the distribution of scores obtained under a carefully crafted null hypothesis. In this manner, we evaluate 17 publicly available speech embedding systems across 6 different datasets, reporting the cross-validation performance on each. We also report cross-dataset results obtained by training on one dataset and testing on another. We observed that within-dataset results vary considerably depending on the dataset, regardless of the embedding used, raising questions about which datasets should be used for benchmarking. We found that cross-dataset accuracy is, as expected, lower than within-dataset accuracy, highlighting challenges in the generalization of the systems. These findings have important implications for the clinical validity of systems trained and tested on the same dataset.
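
The comparison of repeated cross-validation scores against a null distribution can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the null is built by permuting labels (the abstract does not say how its null hypothesis is constructed), the classifier is a simple nearest-centroid rule, the data are synthetic 8-dimensional "embeddings" with one vector per speaker, and the helper names (`nearest_centroid_accuracy`, `cross_val_score`, `repeated_scores`) are hypothetical.

```python
import random
import statistics


def nearest_centroid_accuracy(train, test):
    """Fit class centroids on train and return accuracy on test.

    train and test are lists of (embedding, label) pairs.
    """
    dims = len(train[0][0])
    centroids = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]

    def predict(x):
        # Pick the class whose centroid has the smallest squared distance to x.
        return min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])),
        )

    return sum(predict(x) == y for x, y in test) / len(test)


def cross_val_score(data, k, rng):
    """Mean accuracy over one randomly shuffled k-fold split."""
    idx = list(range(len(data)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [data[i] for i in idx if i not in held_out]
        test = [data[i] for i in fold]
        scores.append(nearest_centroid_accuracy(train, test))
    return statistics.mean(scores)


def repeated_scores(data, runs, k, rng, permute_labels=False):
    """Collect one cross-validation score per run.

    With permute_labels=True the labels are shuffled before each run, which
    breaks any link between embedding and label (assumed null, see lead-in).
    """
    scores = []
    for _ in range(runs):
        run_data = data
        if permute_labels:
            labels = [y for _, y in data]
            rng.shuffle(labels)
            run_data = [(x, y) for (x, _), y in zip(data, labels)]
        scores.append(cross_val_score(run_data, k, rng))
    return scores


# Synthetic stand-in for speaker embeddings: class 1 is shifted by 1.0 per dimension.
rng = random.Random(0)
data = [
    ([rng.gauss(float(i % 2), 1.0) for _ in range(8)], i % 2)
    for i in range(80)
]

observed = repeated_scores(data, runs=20, k=5, rng=rng)
null = repeated_scores(data, runs=20, k=5, rng=rng, permute_labels=True)

# Count how many null runs reach the lowest score seen with the true labels.
exceed = sum(n >= min(observed) for n in null)
print(f"observed mean accuracy: {statistics.mean(observed):.2f}")
print(f"null mean accuracy:     {statistics.mean(null):.2f}")
print(f"null runs reaching lowest observed score: {exceed}/{len(null)}")
```

Comparing the two score distributions, rather than a single score against a nominal 50%, takes the spread caused by small and imbalanced datasets into account. With real recordings, folds would normally be split by speaker so that no speaker appears in both the training and the test portion.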

Comments:	Submitted to ICASSP 2026. This work is supported by NEURii, a collaborative partnership involving the University of Edinburgh, Gates Ventures, Eisai, LifeArc and Health Data Research UK (HDR UK)
Subjects:	Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2509.19946 [eess.AS]
	(or arXiv:2509.19946v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2509.19946

Submission history

From: Jacob Josiah Webber
[v1] Wed, 24 Sep 2025 09:56:11 UTC (4,794 KB)
