Speech Representation Analysis based on Inter- and Intra-Model Similarities

Kheir, Yassine El; Ali, Ahmed; Chowdhury, Shammur Absar

arXiv:2406.16099 (cs)

[Submitted on 23 Jun 2024]

Abstract: Self-supervised learning (SSL) models have revolutionized speech processing, achieving new levels of performance in a wide variety of tasks with limited resources. However, the inner workings of these models are still opaque. In this paper, we analyze the encoded contextual representations of these foundation models based on their inter- and intra-model similarity, independent of any external annotation or task-specific constraint. We examine SSL models that differ in training paradigm, contrastive (Wav2Vec2.0) and predictive (HuBERT), and in model size (base and large). We explore these models at different levels of information localization and distributivity: (i) individual neurons; (ii) layer representations; (iii) attention weights; and (iv) a comparison of the representations with their finetuned counterparts. Our results highlight that these models converge to similar representation subspaces but not to similar neuron-localized concepts (a concept represents a coherent fragment of knowledge, such as "a class containing certain objects as elements, where the objects have certain properties"). To facilitate further research, we have publicly released our code.

Comments:	5 pages, Accepted to appear in ICASSP XAI-SA Workshop
Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2406.16099 [cs.SD]
	(or arXiv:2406.16099v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2406.16099

Submission history

From: Yassine El Kheir
[v1] Sun, 23 Jun 2024 13:00:03 UTC (7,525 KB)
