Towards End-to-End Acoustic Localization using Deep Learning: from Audio Signal to Source Position Coordinates

Vera-Diaz, Juan Manuel; Pizarro, Daniel; Macias-Guarasa, Javier

doi:10.3390/s18103418

arXiv:1807.11094 (cs)

[Submitted on 29 Jul 2018]

Abstract: This paper presents a novel approach to indoor acoustic source localization with microphone arrays, based on a Convolutional Neural Network (CNN). The proposed solution is, to the best of our knowledge, the first published work in which a CNN is designed to directly estimate the three-dimensional position of an acoustic source, using the raw audio signal as input and avoiding hand-crafted audio features. Given the limited amount of available localization data, we propose a two-step training strategy. We first train the network on semi-synthetic data generated from close-talk speech recordings, in which we simulate the time delays and distortion that the signal undergoes as it propagates from the source to the microphone array. We then fine-tune the network using a small amount of real data. Our experimental results show that this strategy produces networks that significantly improve on existing localization methods based on SRP-PHAT strategies. In addition, our experiments show that the CNN method is more robust to variations in speaker gender and to different window sizes than the other methods.
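The semi-synthetic data step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 343 m/s speed of sound, the linear-interpolation fractional delay, the 1/r attenuation, and the additive Gaussian noise (a crude stand-in for the "distortion" the abstract mentions) are all assumptions made here for illustration. Given a close-talk signal, a source position, and microphone positions, it produces one delayed and attenuated channel per microphone.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value, not from the paper


def propagation_delays(source, mics, fs):
    """Return the source-to-microphone delay, in samples, for each microphone."""
    return [math.dist(source, m) / SPEED_OF_SOUND * fs for m in mics]


def delay_signal(signal, delay):
    """Delay `signal` by a fractional number of samples (linear interpolation)."""
    out = []
    for n in range(len(signal)):
        t = n - delay              # position in the original signal
        k = math.floor(t)
        frac = t - k
        # Samples outside the recording are treated as silence.
        a = signal[k] if 0 <= k < len(signal) else 0.0
        b = signal[k + 1] if 0 <= k + 1 < len(signal) else 0.0
        out.append((1.0 - frac) * a + frac * b)
    return out


def simulate_array(signal, source, mics, fs, noise_std=0.0, seed=0):
    """Hypothetical helper: one delayed, attenuated, noisy channel per microphone."""
    rng = random.Random(seed)
    channels = []
    for mic, delay in zip(mics, propagation_delays(source, mics, fs)):
        # Assumed 1/r spherical-spreading attenuation, clamped near the source.
        gain = 1.0 / max(math.dist(source, mic), 1e-3)
        delayed = delay_signal(signal, delay)
        channels.append([gain * s + rng.gauss(0.0, noise_std) for s in delayed])
    return channels
```

For example, calling `simulate_array` with a unit impulse, a source at the origin, and a microphone 3.43 m away at a sampling rate of 1000 Hz places the impulse about 10 samples later in that channel. Pairs of such multichannel windows and their known source coordinates could then serve as training examples before fine-tuning on real recordings.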

Comments: 18 pages, 3 figures, 8 tables
Subjects: Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as: arXiv:1807.11094 [cs.SD] (or arXiv:1807.11094v1 [cs.SD] for this version)
https://doi.org/10.48550/arXiv.1807.11094
Journal reference: Sensors 2018, 18(10), 3418
Related DOI: https://doi.org/10.3390/s18103418

Submission history

From: Javier Macias-Guarasa
[v1] Sun, 29 Jul 2018 18:22:38 UTC (1,469 KB)
