Sound Event Detection Guided by Semantic Contexts of Scenes

Tonami, Noriyuki; Imoto, Keisuke; Nagase, Ryotaro; Okamoto, Yuki; Fukumori, Takahiro; Yamashita, Yoichi

arXiv:2110.03243 (cs)

[Submitted on 7 Oct 2021 (v1), last revised 17 Feb 2022 (this version, v3)]

Abstract: Some studies have revealed that contexts of scenes (e.g., "home," "office," and "cooking") are advantageous for sound event detection (SED). Mobile devices and sensing technologies provide useful information on scenes for SED without the use of acoustic signals. However, conventional methods can employ only pre-defined contexts at the inference stage, not undefined ones, because they use one-hot representations of pre-defined scenes as prior contexts. To alleviate this problem, we propose scene-informed SED where pre-defined scene-agnostic contexts are available for more accurate SED. The proposed method utilizes pre-trained large-scale language models, which enables SED models to employ unseen semantic contexts of scenes at the inference stage. Moreover, we investigate the extent to which the semantic representation of scene contexts is useful for SED. Experiments performed with the TUT Sound Events 2016/2017 and TUT Acoustic Scenes 2016/2017 datasets show that the proposed method improves micro and macro F-scores by 4.34 and 3.13 percentage points compared with conventional Conformer- and CNN-BiGRU-based SED, respectively.
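The contrast between one-hot scene contexts and semantic scene contexts can be illustrated with a minimal sketch. It is not the authors' implementation: the 3-dimensional vectors in `TOY_EMBEDDINGS` are hand-made stand-ins for the output of a pre-trained language model, and the names `one_hot`, `semantic`, `cosine`, and `nearest_predefined` are hypothetical helpers introduced only for this illustration. The scene "kitchen" is assumed to be unseen, that is, absent from the pre-defined scene list.

```python
import math

# Hypothetical, hand-made 3-d vectors standing in for embeddings from a
# pre-trained language model; the numbers are illustrative, not real output.
TOY_EMBEDDINGS = {
    "home":    [0.90, 0.10, 0.00],
    "office":  [0.10, 0.90, 0.00],
    "cooking": [0.80, 0.00, 0.30],
    "kitchen": [0.85, 0.05, 0.25],  # assumed unseen: not a pre-defined scene
}

# Scenes fixed before training, as in a one-hot conditioning scheme.
PREDEFINED_SCENES = ["home", "office", "cooking"]


def one_hot(scene):
    """Return a one-hot context vector, or None if the scene is not pre-defined."""
    if scene not in PREDEFINED_SCENES:
        return None  # a one-hot code has no slot for an undefined scene
    return [1.0 if s == scene else 0.0 for s in PREDEFINED_SCENES]


def semantic(scene):
    """Return the (toy) semantic context vector for any scene word we can embed."""
    return TOY_EMBEDDINGS[scene]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_predefined(scene):
    """Return the pre-defined scene whose toy embedding is closest to `scene`."""
    query = semantic(scene)
    return max(PREDEFINED_SCENES, key=lambda s: cosine(query, semantic(s)))


if __name__ == "__main__":
    print(one_hot("office"))              # a valid one-hot code
    print(one_hot("kitchen"))             # None: the scene is undefined
    print(nearest_predefined("kitchen"))  # closest pre-defined scene in toy space
```

In this toy setting the one-hot encoder cannot represent "kitchen" at all, while the semantic vector places it near related pre-defined scenes, which is the property that lets a context-conditioned SED model accept scenes it never saw during training.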

Comments:	Accepted to ICASSP 2022
Subjects:	Sound (cs.SD)
Cite as:	arXiv:2110.03243 [cs.SD]
	(or arXiv:2110.03243v3 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2110.03243

Submission history

From: Noriyuki Tonami
[v1] Thu, 7 Oct 2021 07:55:03 UTC (825 KB)
[v2] Fri, 4 Feb 2022 08:35:20 UTC (825 KB)
[v3] Thu, 17 Feb 2022 06:03:02 UTC (834 KB)
