-
Analytical model for the relation between signal bandwidth and spatial resolution in Steered-Response Power Phase Transform (SRP-PHAT) maps
Authors:
Guillermo Garcia-Barrios,
Juana M. Gutierrez-Arriola,
Nicolas Saenz-Lechon,
Victor Jose Osma-Ruiz,
Ruben Fraile
Abstract:
An analysis of the relationship between the bandwidth of acoustic signals and the required resolution of steered-response power phase transform (SRP-PHAT) maps used for sound source localization is presented. This relationship does not rely on the far-field assumption, nor does it depend on any specific array topology. The proposed analysis considers the computation of an SRP map as a process of sampling a set of generalized cross-correlation (GCC) functions, each one corresponding to a different microphone pair. From this approach, we derive a rule that relates GCC bandwidth to inter-microphone distance, resolution of the SRP map, and the potential position of the sound source relative to the array position. This rule is a sufficient condition for an aliasing-free calculation of the specified SRP-PHAT map. Simulation results show that limiting the bandwidth of the GCC according to this rule leads to significant reductions in sound source localization errors when sources are not in the immediate vicinity of the microphone array. These error reductions are more relevant for coarser resolutions of the SRP map, and they occur in both anechoic and reverberant environments.
Submitted 9 February, 2024;
originally announced February 2024.
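The abstract above treats an SRP map as samples of per-pair GCC functions. As a rough illustration of what one such function looks like, the following is a minimal sketch of the standard GCC-PHAT computation for a single microphone pair. It is not the paper's bandwidth rule, which the abstract does not state; the function names `gcc_phat` and `estimate_delay` and the sampling rate are assumptions made for this example.

```python
import numpy as np


def gcc_phat(x, y, fs):
    """Return (lags_in_seconds, gcc) for one microphone pair (hypothetical helper)."""
    n = len(x) + len(y)                      # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)                   # cross-power spectrum
    cross /= np.abs(cross) + 1e-12           # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    # Reorder so that negative lags come first and lag 0 sits in the middle.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lags = np.arange(-max_shift, max_shift + 1) / fs
    return lags, cc


def estimate_delay(x, y, fs):
    """Delay of x relative to y, in seconds, taken at the GCC-PHAT peak."""
    lags, cc = gcc_phat(x, y, fs)
    return lags[np.argmax(cc)]


# Synthetic check: x is y delayed by 5 samples (assumed fs of 16 kHz).
rng = np.random.default_rng(0)
s = rng.standard_normal(256)
y_sig = np.concatenate((s, np.zeros(5)))
x_sig = np.concatenate((np.zeros(5), s))
delay = estimate_delay(x_sig, y_sig, 16000)
```

An SRP-PHAT map would then be built by evaluating such GCC functions, for every microphone pair, at the delays implied by each candidate source position.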
-
Exploiting spatial diversity for increasing the robustness of sound source localization systems against reverberation
Authors:
Guillermo Garcia-Barrios,
Eduardo Latorre Iglesias,
Juana M. Gutierrez-Arriola,
Ruben Fraile,
Nicolas Saenz-Lechon,
Victor Jose Osma-Ruiz
Abstract:
Acoustic reverberation is one of the most relevant factors that hamper the localization of a sound source inside a room. To date, several approaches have been proposed to deal with it, but they have not always been evaluated under realistic conditions. This paper proposes exploiting spatial diversity as an alternative approach to achieve robustness against reverberation. The theoretical arguments supporting this approach are first presented and later confirmed by means of simulation results and real measurements. Simulations are run for reverberation times up to 2 s, thus providing results with a wider range of validity than in previous research works. It is concluded that the use of systems consisting of several sufficiently separated small arrays leads to the best results in reverberant environments. Some recommendations are given regarding the choice of the array sizes, the separation among them, and the way to combine SRP-PHAT maps obtained from diverse arrays.
Submitted 9 February, 2024;
originally announced February 2024.
-
A Transversal Study of Fundamental Frequency Contours in Parkinsonian Voices
Authors:
Pablo Rodriguez-Perez,
Ruben Fraile,
Miguel Garcia-Escrig,
Nicolas Saenz-Lechon,
Juana M. Gutierrez-Arriola,
Victor Osma-Ruiz
Abstract:
A transversal study of the pitch variability of parkinsonian voices in read speech is presented. Thirty patients suffering from Parkinson's disease (PD) and 32 healthy speakers were recorded while reading a text without voiceless phonemes. The fundamental frequency contours were calculated from the recordings, and the following measures were used to describe them: mean, minimum, maximum, and standard deviation of the estimated fundamental frequencies. Results based on these measures indicate that the influence of PD on some aspects of intonation can be masked by the effects of aging, especially for male voices. However, some parameters, such as the relative fundamental frequency range, exhibit lower correlations with age than with PD stage, as evaluated using the Hoehn and Yahr scale. These correlations between relative fundamental frequency range and PD stage reach moderate-to-high values in the case of women. Additionally, three parameters describing the form of the fundamental frequency modulation spectrum were investigated for correlation with age and PD stage. The study of this modulation spectrum provides some insight into the ability of the speakers to plan the intonation of full phrases. For both male and female populations, significant correlations were found between parameters obtained from the modulation spectrum of fundamental frequency and the PD stage. Nevertheless, the quantitative assessment of the performance of regression models built from these modulation parameters and fundamental frequency range suggests that such measures are likely to be of limited value in the early diagnosis of PD due to inter-speaker variability.
Submitted 9 February, 2024;
originally announced February 2024.
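The contour measures named in the abstract above (mean, minimum, maximum, and standard deviation of the fundamental frequency) can be sketched as follows. This is an illustrative sketch only: the function name `f0_contour_measures` is hypothetical, unvoiced frames are assumed to be marked as 0 or NaN, and the relative range is assumed here to be (max - min) / mean, a definition the abstract does not give.

```python
import numpy as np


def f0_contour_measures(f0_hz):
    """Summary statistics of an F0 contour given in Hz (hypothetical helper)."""
    f0 = np.asarray(f0_hz, dtype=float)
    # Assumption: frames with 0 or NaN carry no pitch estimate and are dropped.
    f0 = f0[np.isfinite(f0) & (f0 > 0)]
    if f0.size == 0:
        raise ValueError("contour contains no voiced frames")
    mean = float(f0.mean())
    f0_min = float(f0.min())
    f0_max = float(f0.max())
    return {
        "mean": mean,
        "min": f0_min,
        "max": f0_max,
        "std": float(f0.std()),              # population standard deviation
        # Assumed definition of relative range; the abstract does not specify it.
        "relative_range": (f0_max - f0_min) / mean,
    }


measures = f0_contour_measures([100.0, 0.0, 120.0, float("nan"), 140.0])
```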
-
Skin lesion segmentation based on preprocessing, thresholding and neural networks
Authors:
Juana M. Gutiérrez-Arriola,
Marta Gómez-Álvarez,
Victor Osma-Ruiz,
Nicolás Sáenz-Lechón,
Rubén Fraile
Abstract:
This abstract describes the segmentation system used to participate in the ISIC 2017 challenge: Skin Lesion Analysis Towards Melanoma Detection. Several preprocessing techniques were tested for three color representations (RGB, YCbCr and HSV) of 392 images. The results were used to choose the best preprocessing for each channel. In each case, a neural network is trained to predict the Jaccard Index based on object characteristics. The system includes black frame and reference circle detection algorithms, but no special treatment is applied for hair removal. Segmentation is performed in two steps. First, the best channel to be segmented is chosen by selecting the best neural network output. Second, if this output does not predict a Jaccard Index over 0.5, a more aggressive preprocessing is performed using open and close morphological operations, and the segmentation of the channel that obtains the best output from the neural networks is selected as the lesion.
Submitted 14 March, 2017;
originally announced March 2017.
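The Jaccard Index and the two-step channel selection described in the abstract above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `jaccard_index` and `select_channel` are hypothetical names, and the per-channel scores stand in for the neural network predictions, which are not reproduced here.

```python
import numpy as np


def jaccard_index(mask_a, mask_b):
    """Intersection over union of two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0                           # convention: two empty masks match
    return float(np.logical_and(a, b).sum() / union)


def select_channel(predicted_scores, threshold=0.5):
    """Pick the channel with the highest predicted Jaccard Index.

    Returns (channel, needs_fallback); needs_fallback is True when the best
    prediction is not over the threshold, i.e. when the more aggressive
    preprocessing step would be triggered.
    """
    best = max(predicted_scores, key=predicted_scores.get)
    return best, predicted_scores[best] <= threshold


score = jaccard_index([[1, 1], [0, 0]], [[1, 0], [0, 0]])
# Made-up example predictions for three channels.
choice = select_channel({"R": 0.4, "Cb": 0.7, "V": 0.6})
fallback = select_channel({"R": 0.4, "Cb": 0.5})
```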