AI-based soundscape analysis: Jointly identifying sound sources and predicting annoyance

Hou, Yuanbo; Ren, Qiaoqiao; Zhang, Huizhong; Mitchell, Andrew; Aletta, Francesco; Kang, Jian; Botteldooren, Dick

doi:10.1121/10.0022408

Electrical Engineering and Systems Science > Audio and Speech Processing

arXiv:2311.09030 (eess)

[Submitted on 15 Nov 2023]

Title: AI-based soundscape analysis: Jointly identifying sound sources and predicting annoyance

Authors: Yuanbo Hou, Qiaoqiao Ren, Huizhong Zhang, Andrew Mitchell, Francesco Aletta, Jian Kang, Dick Botteldooren


Abstract: Soundscape studies typically attempt to capture the perception and understanding of sonic environments by surveying users. However, long-term monitoring and the assessment of interventions require sound-signal-based approaches. To this end, most previous research has focused on psychoacoustic quantities or automatic sound recognition, and few attempts have been made to include appraisal (e.g., in circumplex frameworks). This paper proposes an artificial intelligence (AI)-based dual-branch convolutional neural network with cross-attention-based fusion (DCNN-CaF) for automatic soundscape characterization, including sound recognition and appraisal. Using the DeLTA dataset, which contains human-annotated sound source labels and perceived annoyance ratings, the DCNN-CaF performs sound source classification (SSC) and human-perceived annoyance rating prediction (ARP). Experimental findings indicate that (1) the proposed DCNN-CaF using both loudness and Mel features outperforms the DCNN-CaF using only one of them; (2) the proposed DCNN-CaF with cross-attention fusion outperforms other typical AI-based models and soundscape-related traditional machine learning methods on the SSC and ARP tasks; (3) correlation analysis reveals that the relationship between sound sources and annoyance is similar for humans and the proposed AI-based DCNN-CaF model; and (4) generalization tests show that the proposed model's ARP in the presence of model-unknown sound sources is consistent with expert expectations and can explain previous findings from the literature on soundscape augmentation.
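
The abstract names cross-attention-based fusion of a Mel branch and a loudness branch feeding two tasks (SSC and ARP) but gives no architectural detail. The following is a minimal illustrative sketch of that general idea, not the authors' DCNN-CaF: it assumes each branch has already produced a sequence of frame embeddings of a common width, uses plain scaled dot-product attention without learned projections, and attaches two hypothetical linear heads with random weights. All names, sizes, and the choice of sigmoid outputs for SSC are assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Scaled dot-product attention: each query frame attends over all context frames.

    Assumption: no learned query/key/value projections, to keep the sketch short.
    Returns the attended frames and the attention weights.
    """
    d = len(queries[0])
    context_t = [list(col) for col in zip(*context)]  # transpose to (d, T_context)
    scores = matmul(queries, context_t)               # (T_query, T_context)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, context), weights


def mean_pool(frames):
    """Average a sequence of frame vectors over time."""
    return [sum(col) / len(frames) for col in zip(*frames)]


def fuse(mel, loudness):
    """Let each branch attend to the other, pool over time, and concatenate."""
    mel_ctx, _ = cross_attention(mel, loudness)
    loud_ctx, _ = cross_attention(loudness, mel)
    return mean_pool(mel_ctx) + mean_pool(loud_ctx)


def linear(vec, weight, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


rng = random.Random(0)
D, N_SOURCES = 4, 3  # hypothetical embedding width and number of sound source classes

# Stand-ins for the outputs of the two convolutional branches (random, for illustration).
mel = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(6)]
loudness = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(5)]

fused = fuse(mel, loudness)

# Hypothetical task heads with random, untrained weights.
w_ssc = [[rng.gauss(0, 0.5) for _ in range(2 * D)] for _ in range(N_SOURCES)]
w_arp = [[rng.gauss(0, 0.5) for _ in range(2 * D)]]
ssc_probs = [sigmoid(z) for z in linear(fused, w_ssc, [0.0] * N_SOURCES)]  # per-source scores
annoyance = linear(fused, w_arp, [0.0])[0]                                 # scalar rating
```

Sharing one fused representation between both heads is what makes the setup joint: in a trained model, a classification loss on the SSC output and a regression loss on the ARP output would both update the shared branches. The weighting of those losses is not stated in the abstract and is left out here.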

Comments:	The Journal of the Acoustical Society of America, 154 (5), 3145
Subjects:	Audio and Speech Processing (eess.AS); Sound (cs.SD)
Cite as:	arXiv:2311.09030 [eess.AS]
	(or arXiv:2311.09030v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.2311.09030
Journal reference:	The Journal of the Acoustical Society of America, 154, 3145 (2023)
Related DOI:	https://doi.org/10.1121/10.0022408

Submission history

From: Yuanbo Hou
[v1] Wed, 15 Nov 2023 15:23:33 UTC (2,134 KB)
