Filler Word Detection and Classification: A Dataset and Benchmark

Zhu, Ge; Caceres, Juan-Pablo; Salamon, Justin

arXiv:2203.15135 (cs)

[Submitted on 28 Mar 2022 (v1), last revised 2 Jul 2022 (this version, v2)]

Abstract: Filler words such as "uh" or "um" are sounds or words people use to signal they are pausing to think. Finding and removing filler words from recordings is a common and tedious task in media editing. Automatically detecting and classifying filler words could greatly aid in this task, but few studies have been published on this problem to date. A key reason is the absence of a dataset with annotated filler words for model training and evaluation. In this work, we present a novel speech dataset, PodcastFillers, with 35K annotated filler words and 50K annotations of other sounds that commonly occur in podcasts such as breaths, laughter, and word repetitions. We propose a pipeline that leverages VAD and ASR to detect filler candidates and a classifier to distinguish between filler word types. We evaluate our proposed pipeline on PodcastFillers, compare to several baselines, and present a detailed ablation study. In particular, we evaluate the importance of using ASR and how it compares to a transcription-free approach resembling keyword spotting. We show that our pipeline obtains state-of-the-art results, and that leveraging ASR strongly outperforms a keyword spotting approach. We make PodcastFillers publicly available, in the hope that our work serves as a benchmark for future research.
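
The abstract does not spell out how VAD and ASR are combined to produce filler candidates. One plausible reading, offered here only as an illustrative sketch and not as the authors' implementation, is that voiced regions reported by a VAD which are not covered by any ASR word timestamp are treated as candidates and then passed to a classifier. The function name `find_filler_candidates`, the interval representation, and the `min_dur` threshold are all hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_sec, end_sec)


def find_filler_candidates(
    voiced: List[Interval],
    words: List[Interval],
    min_dur: float = 0.1,  # hypothetical threshold, not taken from the paper
) -> List[Interval]:
    """Return the parts of each voiced region not covered by an ASR word.

    Assumption: `voiced` comes from some VAD and `words` from ASR word-level
    timestamps. Both are plain (start, end) pairs in seconds.
    """
    candidates: List[Interval] = []
    words = sorted(words)
    for v_start, v_end in sorted(voiced):
        cursor = v_start  # left edge of the still-unexplained voiced audio
        for w_start, w_end in words:
            if w_end <= cursor:
                continue  # word ends before the unexplained part begins
            if w_start >= v_end:
                break  # words are sorted, so no later word can overlap
            if w_start > cursor:
                candidates.append((cursor, w_start))  # gap before this word
            cursor = max(cursor, w_end)
            if cursor >= v_end:
                break
        if cursor < v_end:
            candidates.append((cursor, v_end))  # trailing gap after last word
    # Drop gaps too short to plausibly contain a filler.
    return [(s, e) for s, e in candidates if e - s >= min_dur]


if __name__ == "__main__":
    voiced_regions = [(0.0, 3.0)]
    asr_words = [(0.0, 1.0), (1.5, 3.0)]
    print(find_filler_candidates(voiced_regions, asr_words))
```

In this sketch each returned interval would be handed to a separate classifier that decides whether it is a filler and, if so, which type; that classifier is outside the scope of the example.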

Comments:	To appear at Interspeech 2022
Subjects:	Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2203.15135 [cs.CL]
	(or arXiv:2203.15135v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2203.15135

Submission history

From: Ge Zhu
[v1] Mon, 28 Mar 2022 22:53:54 UTC (736 KB)
[v2] Sat, 2 Jul 2022 00:34:13 UTC (4,007 KB)
