Automatic classification of stop realisation with wav2vec2.0

Tanner, James; Sonderegger, Morgan; Stuart-Smith, Jane; Mielke, Jeff; Kendall, Tyler

arXiv:2505.23688 (cs)

[Submitted on 29 May 2025 (v1), last revised 30 May 2025 (this version, v2)]

Abstract: Modern phonetic research regularly makes use of automatic tools for the annotation of speech data; however, few tools exist for the annotation of many variable phonetic phenomena. At the same time, pre-trained self-supervised models, such as wav2vec2.0, have been shown to perform well at speech classification tasks and to latently encode fine-grained phonetic information. We demonstrate that wav2vec2.0 models can be trained to automatically classify stop burst presence with high accuracy in both English and Japanese, and that this performance is robust across both finely curated and unprepared speech corpora. Patterns of variability in stop realisation are replicated with the automatic annotations, and closely follow those of manual annotations. These results demonstrate the potential of pre-trained speech models as tools for the automatic annotation and processing of speech corpus data, enabling researchers to 'scale up' the scope of phonetic research with relative ease.

Comments:	Accepted for Interspeech 2025. 5 pages, 3 figures
Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2505.23688 [cs.CL]
	(or arXiv:2505.23688v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2505.23688

Submission history

From: James Tanner
[v1] Thu, 29 May 2025 17:25:35 UTC (5,638 KB)
[v2] Fri, 30 May 2025 03:54:35 UTC (5,638 KB)
