Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment

Boeddeker, Christoph; Cord-Landwehr, Tobias; Haeb-Umbach, Reinhold

arXiv:2406.03155 (eess)

[Submitted on 5 Jun 2024]

Abstract: Diarization is a crucial component in meeting transcription systems to ease the challenges of speech enhancement and attribute the transcriptions to the correct speaker. Particularly in the presence of overlapping or noisy speech, these systems have problems reliably assigning the correct speaker labels, leading to a significant amount of speaker confusion errors. We propose to add segment-level speaker reassignment to address this issue. By revisiting, after speech enhancement, the speaker attribution for each segment, speaker confusion errors from the initial diarization stage are significantly reduced. Through experiments across different system configurations and datasets, we further demonstrate the effectiveness and applicability in various domains. Our results show that segment-level speaker reassignment successfully rectifies at least 40% of speaker confusion word errors, highlighting its potential for enhancing diarization accuracy in meeting transcription systems.
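The abstract does not spell out how the reassignment is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each segment already has a speaker embedding extracted from its enhanced audio, that one centroid per speaker is formed from the initial diarization labels, and that each segment is then re-labelled with the most similar centroid under cosine similarity. The names `cosine` and `reassign_speakers` and the two-dimensional toy embeddings are hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def reassign_speakers(segments):
    """Re-label each segment with its most similar speaker centroid.

    `segments` is a list of (initial_label, embedding) pairs. The embedding
    is assumed (hypothetically) to come from the enhanced audio of the segment.
    """
    # Build one centroid per speaker from the initial diarization labels.
    grouped = defaultdict(list)
    for label, emb in segments:
        grouped[label].append(emb)
    centroids = {
        label: [sum(dim) / len(embs) for dim in zip(*embs)]
        for label, embs in grouped.items()
    }
    # Revisit every segment and pick the closest centroid.
    return [
        max(centroids, key=lambda spk: cosine(emb, centroids[spk]))
        for _, emb in segments
    ]


# Toy data: the last segment was initially labelled "B" but resembles "A".
segments = [
    ("A", [1.0, 0.0]),
    ("A", [0.9, 0.1]),
    ("B", [0.0, 1.0]),
    ("B", [0.1, 0.9]),
    ("B", [0.95, 0.05]),
]
print(reassign_speakers(segments))
```

In this toy case the fifth segment moves from "B" to "A" while the other labels stay unchanged, which is the kind of speaker confusion error the abstract describes correcting after enhancement.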

Comments:	Accepted for Interspeech 2024
Subjects:	Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2406.03155 [eess.AS]
	(or arXiv:2406.03155v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2406.03155

Submission history

From: Christoph Boeddeker
[v1] Wed, 5 Jun 2024 11:32:13 UTC (150 KB)
