-
Blind Estimation of Sub-band Acoustic Parameters from Ambisonics Recordings using Spectro-Spatial Covariance Features
Authors:
Hanyu Meng,
Jeroen Breebaart,
Jeremy Stoddard,
Vidhyasaharan Sethu,
Eliathamby Ambikairajah
Abstract:
Estimating frequency-varying acoustic parameters is essential for enhancing immersive perception in realistic spatial audio creation. In this paper, we propose a unified framework that blindly estimates reverberation time (T60), direct-to-reverberant ratio (DRR), and clarity (C50) across 10 frequency bands using first-order Ambisonics (FOA) speech recordings as inputs. The proposed framework utilizes a novel feature named Spectro-Spatial Covariance Vector (SSCV), which efficiently represents the temporal, spectral, and spatial information of the FOA signal. Our models significantly outperform existing single-channel methods that use only spectral information, reducing estimation errors by more than half for all three acoustic parameters. Additionally, we introduce FOA-Conv3D, a novel back-end network that uses a 3D convolutional encoder to exploit the SSCV feature effectively. FOA-Conv3D outperforms the convolutional neural network (CNN) and recurrent convolutional neural network (CRNN) back-ends, achieving lower estimation errors and accounting for a higher proportion of variance (PoV) for all three acoustic parameters.
Submitted 12 January, 2025; v1 submitted 5 November, 2024;
originally announced November 2024.
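The abstract of this entry does not define the SSCV feature, so the following is only a minimal sketch of a generic building block that such features are commonly based on: a per-frequency spatial covariance matrix computed from the four FOA channels. The function names `stft` and `spatial_covariance`, the frame sizes, and the random test signal are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def stft(x, n_fft=256, hop=128):
    """Hann-windowed STFT of a 1-D signal; returns an array of shape (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=-1)

def spatial_covariance(foa, n_fft=256, hop=128):
    """Illustrative only (not the paper's SSCV).

    foa: array of shape (4, samples), one row per FOA channel.
    Returns an array of shape (bins, 4, 4): for each frequency bin f,
    R[f] = mean over frames t of x[f, t] x[f, t]^H.
    """
    spec = np.stack([stft(ch, n_fft, hop) for ch in foa])  # (4, frames, bins)
    n_frames = spec.shape[1]
    return np.einsum('ctf,dtf->fcd', spec, spec.conj()) / n_frames

# Hypothetical input: 1 s of white noise on four channels at 16 kHz.
rng = np.random.default_rng(0)
foa = rng.standard_normal((4, 16000))
cov = spatial_covariance(foa)
print(cov.shape)
```

Each 4x4 matrix is Hermitian: its diagonal holds per-channel power (spectral information) and its off-diagonal entries hold inter-channel correlations (spatial information). Computing it over short blocks of frames rather than the whole signal would add a temporal axis.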
-
Volterra Kernel Identification using Regularized Orthonormal Basis Functions
Authors:
Jeremy G. Stoddard,
James S. Welsh
Abstract:
The Volterra series is a powerful tool for modelling a broad range of nonlinear dynamic systems. However, due to its nonparametric nature, the number of parameters in the series increases rapidly with memory length and series order, with the uncertainty in the resulting model estimates increasing accordingly. In this paper, we propose an identification method where the Volterra kernels are estimated indirectly through orthonormal basis function expansions, with regularization applied directly to the expansion coefficients to reduce variance in the final model estimate and provide access to useful models at previously infeasible series orders. The higher-dimensional kernel expansions are regularized using a method that allows smoothness and decay to be imposed on the entire hyper-surface. Numerical examples demonstrate improved Volterra series estimation up to the 4th order using the regularized basis function method.
Submitted 19 April, 2018;
originally announced April 2018.
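The parameter growth mentioned in the abstract of this entry can be made concrete with a short count. The sketch below assumes a discrete-time Volterra series with memory length M: a kernel of order n then has M^n coefficients, of which C(M + n - 1, n) are unique if the kernel is taken to be symmetric. The function name `kernel_sizes` and the choice M = 20 are illustrative assumptions, not values from the paper.

```python
from math import comb

def kernel_sizes(memory, max_order):
    """Coefficient counts per kernel order for a discrete-time Volterra series.

    Returns a list of (order, full_count, symmetric_count) tuples, where
    full_count = memory**order and symmetric_count = C(memory + order - 1, order).
    """
    return [
        (n, memory ** n, comb(memory + n - 1, n))
        for n in range(1, max_order + 1)
    ]

# Hypothetical setting: memory length 20, kernels up to 4th order.
sizes = kernel_sizes(memory=20, max_order=4)
for order, full, symmetric in sizes:
    print(f"order {order}: {full} coefficients, {symmetric} unique if symmetric")

total_symmetric = sum(s for _, _, s in sizes)
print("total unique coefficients:", total_symmetric)  # 10625
```

Even with symmetry, the 4th-order kernel alone needs 8855 coefficients at this memory length, which is why a compact basis expansion combined with regularization becomes attractive at higher orders.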
-
Gaussian Process Regression for Generalized Frequency Response Function Estimation
Authors:
Jeremy Stoddard,
Georgios Birpoutsoukis
Abstract:
Kernel-based modeling of dynamic systems has garnered a significant amount of attention in the system identification literature since its introduction to the field. While the method was originally applied to linear impulse response estimation in the time domain, the concepts have since been extended to the frequency domain for estimation of frequency response functions (FRFs), as well as to the estimation of the Volterra series in the time domain. In the latter case, smoothness and exponential decay were imposed along the hypersurfaces of the multidimensional impulse responses, allowing lower-variance estimates than could be obtained in a simple least squares framework. The Volterra series can also be expressed in the frequency domain; however, there are several competing representations, each with its own advantages. Perhaps the most natural representation is the generalized frequency response function (GFRF), which is defined as the multidimensional Fourier transform of the corresponding Volterra kernel in the time-domain series. This representation leads to a series of frequency-domain functions of increasing dimension.
Submitted 26 October, 2017;
originally announced October 2017.
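The GFRF definition stated in the abstract of this entry can be written out as follows. The notation (input u, output y, kernels h_n, GFRFs H_n, continuous-time integrals) is an assumption chosen for illustration and may differ from the paper's conventions.

```latex
% Time-domain Volterra series with kernels h_n (assumed notation)
y(t) = h_0 + \sum_{n=1}^{N} \int_{-\infty}^{\infty} \!\cdots\! \int_{-\infty}^{\infty}
       h_n(\tau_1, \dots, \tau_n) \prod_{i=1}^{n} u(t - \tau_i) \, d\tau_1 \cdots d\tau_n

% n-th order GFRF: the n-dimensional Fourier transform of the n-th kernel
H_n(j\omega_1, \dots, j\omega_n) =
       \int_{-\infty}^{\infty} \!\cdots\! \int_{-\infty}^{\infty}
       h_n(\tau_1, \dots, \tau_n) \,
       e^{-j(\omega_1 \tau_1 + \cdots + \omega_n \tau_n)} \, d\tau_1 \cdots d\tau_n
```

For n = 1 this reduces to the ordinary linear FRF, and each higher order adds one frequency argument, which is the increase in dimension the abstract refers to.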