Multichannel Source Separation and Speech Enhancement Using the Convolutive Transfer Function

Li, Xiaofei; Girin, Laurent; Gannot, Sharon; Horaud, Radu

arXiv:1711.07911v1 (cs)

[Submitted on 21 Nov 2017 (this version), latest version 26 Feb 2018 (v2)]

Title: Multichannel Source Separation and Speech Enhancement Using the Convolutive Transfer Function

Authors: Xiaofei Li, Laurent Girin, Sharon Gannot, Radu Horaud

Abstract: This paper addresses the problem of audio source recovery from multichannel noisy convolutive mixtures for source separation and speech enhancement, assuming known mixing filters. We propose to conduct the source recovery in the short-time Fourier transform domain, based on the convolutive transfer function (CTF) approximation. Compared to time-domain filters, the CTF has far fewer taps, and thus fewer near-common zeros among channels and lower computational complexity. This work proposes three source recovery methods: i) the multichannel inverse filtering method, i.e. the multiple input/output inverse theorem (MINT), is exploited in the CTF domain and for the multisource case; ii) a beamforming-like multichannel inverse filtering method is proposed, applying the single-source MINT and power minimization, which is suitable for the case where the CTFs of only some of the sources are known; iii) a constrained Lasso method, in which the sources are recovered by minimizing their $\ell_1$-norm to impose spectral sparsity, with the constraint that the $\ell_2$-norm fitting cost between the microphone signals and the mixture model involving the unknown source signals is less than a tolerance. The noise can be reduced by setting the tolerance to the noise power. Experiments under various acoustic conditions are conducted to evaluate the three proposed methods. Comparisons among them and with baseline methods are presented.
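To make the idea behind method i) concrete, the following is a minimal sketch of MINT-style inverse filtering for a single source in one frequency bin. It is not the authors' implementation: the function names (`convolution_matrix`, `mint_inverse_filters`, `demo`), the random CTFs, the two-channel noise-free setup, and the least-squares solver are all assumptions chosen for illustration. The sketch looks for inverse filters $h_i$ such that $\sum_i a_i \ast h_i$ is a unit impulse, where $a_i$ is the CTF of channel $i$ and the convolution runs along the frame axis.

```python
import numpy as np


def convolution_matrix(a, n_filter_taps):
    """Return the Toeplitz matrix C such that C @ h == np.convolve(a, h)."""
    q = len(a)
    C = np.zeros((q + n_filter_taps - 1, n_filter_taps), dtype=complex)
    for k in range(n_filter_taps):
        C[k:k + q, k] = a  # column k is the CTF delayed by k frames
    return C


def mint_inverse_filters(ctfs, n_filter_taps, delay=0):
    """Least-squares inverse filters for one frequency bin (illustrative).

    ctfs: complex array of shape (n_channels, n_ctf_taps).
    Returns an array of shape (n_channels, n_filter_taps) whose rows h_i
    satisfy sum_i convolve(ctfs[i], h_i) ~= unit impulse at index `delay`.
    """
    n_channels, q = ctfs.shape
    # Stack the per-channel convolution matrices side by side.
    A = np.hstack([convolution_matrix(a, n_filter_taps) for a in ctfs])
    target = np.zeros(q + n_filter_taps - 1, dtype=complex)
    target[delay] = 1.0
    h, *_ = np.linalg.lstsq(A, target, rcond=None)
    return h.reshape(n_channels, n_filter_taps)


def demo(seed=0):
    """Recover a synthetic source from a noise-free two-channel CTF mixture."""
    rng = np.random.default_rng(seed)
    n_channels, q, n_frames = 2, 4, 50  # assumed toy dimensions
    ctfs = (rng.standard_normal((n_channels, q))
            + 1j * rng.standard_normal((n_channels, q)))
    s = rng.standard_normal(n_frames) + 1j * rng.standard_normal(n_frames)
    # Microphone STFT coefficients in this bin: frame-wise convolution.
    x = np.stack([np.convolve(a, s) for a in ctfs])
    # With two channels, q - 1 inverse-filter taps give a square system.
    h = mint_inverse_filters(ctfs, n_filter_taps=q - 1)
    y = sum(np.convolve(h_i, x_i) for h_i, x_i in zip(h, x))
    return s, y[:n_frames]


if __name__ == "__main__":
    s, s_hat = demo()
    print("max abs error:", np.max(np.abs(s - s_hat)))
```

In this toy setting the system is square and, for CTFs without common zeros, exactly solvable, so the recovered frames match the source up to numerical precision. With noise, near-common zeros, or several sources, the problem is harder, which is the setting the three methods in the abstract are designed for.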

Comments:	13 pages, 5 figures
Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:1711.07911 [cs.SD]
	(or arXiv:1711.07911v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.1711.07911

Submission history

From: Radu Horaud P
[v1] Tue, 21 Nov 2017 17:02:03 UTC (354 KB)
[v2] Mon, 26 Feb 2018 10:00:40 UTC (355 KB)
