Showing 1–44 of 44 results for author: Naylor, P

  1. arXiv:2409.18591  [pdf, other]

    cs.CV

    Off to new Shores: A Dataset & Benchmark for (near-)coastal Flood Inundation Forecasting

    Authors: Brandon Victor, Mathilde Letard, Peter Naylor, Karim Douch, Nicolas Longépé, Zhen He, Patrick Ebel

    Abstract: Floods are among the most common and devastating natural hazards, imposing immense costs on our society and economy due to their disastrous consequences. Recent progress in weather prediction and spaceborne flood mapping demonstrated the feasibility of anticipating extreme events and reliably detecting their catastrophic effects afterwards. However, these efforts are rarely linked to one another a…

    Submitted 27 September, 2024; originally announced September 2024.

    Comments: Accepted at NeurIPS 2024 Datasets & Benchmarks

  2. arXiv:2409.11107  [pdf, other]

    eess.AS cs.SD

    Zero Shot Text to Speech Augmentation for Automatic Speech Recognition on Low-Resource Accented Speech Corpora

    Authors: Francesco Nespoli, Daniel Barreda, Patrick A. Naylor

    Abstract: In recent years, automatic speech recognition (ASR) models greatly improved transcription performance both in clean, low noise, acoustic conditions and in reverberant environments. However, all these systems rely on the availability of hundreds of hours of labelled training data in specific acoustic conditions. When such a training dataset is not available, the performance of the system is heavily…

    Submitted 17 September, 2024; originally announced September 2024.

    Comments: Accepted to the Asilomar 2023 Conference

  3. arXiv:2407.06342  [pdf, other]

    eess.AS

    XANE Background Acoustic Embeddings: Ablation and Clustering Analysis

    Authors: Dushyant Sharma, James Fosburgh, Sri Harsha Dumpala, Chandramouli Shama Sastri, Stanislav Yu. Kruchinin, Patrick A. Naylor

    Abstract: We explore the recently proposed explainable acoustic neural embedding (XANE) system that models the background acoustics of a speech signal in a non-intrusive manner. The XANE embeddings are used to estimate specific parameters related to the background acoustic properties of the signal which allows the embeddings to be explainable in terms of those parameters. We perform ablation studies on the…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2406.05199

  4. arXiv:2406.06413  [pdf, ps, other]

    math.GT

    Exotic definite four-manifolds with non-cyclic fundamental group

    Authors: Robert Harris, Patrick Naylor, B. Doug Park

    Abstract: We construct infinitely many pairwise non-diffeomorphic smooth structures on a definite $4$-manifold with non-cyclic fundamental group $\mathbb{Z}/2\times \mathbb{Z}/2$.

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 11 pages, comments welcome!

    MSC Class: 57R55; 57K41; 57M10; 14E20

  5. arXiv:2406.05199  [pdf, other]

    eess.AS cs.SD

    XANE: eXplainable Acoustic Neural Embeddings

    Authors: Sri Harsha Dumpala, Dushyant Sharma, Chandramouli Shama Sastri, Stanislav Kruchinin, James Fosburgh, Patrick A. Naylor

    Abstract: We present a novel method for extracting neural embeddings that model the background acoustics of a speech signal. The extracted embeddings are used to estimate specific parameters related to the background acoustic properties of the signal in a non-intrusive manner, which allows the embeddings to be explainable in terms of those parameters. We illustrate the value of these embeddings by performin…

    Submitted 7 June, 2024; originally announced June 2024.

  6. arXiv:2405.02991  [pdf, other]

    cs.SD eess.AS

    Steered Response Power for Sound Source Localization: A Tutorial Review

    Authors: Eric Grinstein, Elisa Tengan, Bilgesu Çakmak, Thomas Dietzen, Leonardo Nunes, Toon van Waterschoot, Mike Brookes, Patrick A. Naylor

    Abstract: In the last three decades, the Steered Response Power (SRP) method has been widely used for the task of Sound Source Localization (SSL), due to its satisfactory localization performance on moderately reverberant and noisy scenarios. Many works have analyzed and extended the original SRP method to reduce its computational cost, to allow it to locate multiple sources, or to improve its performance i…

    Submitted 9 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

  7. arXiv:2404.05758  [pdf, other]

    physics.data-an cs.AI cs.CV cs.LG physics.ao-ph stat.AP

    Implicit Assimilation of Sparse In Situ Data for Dense & Global Storm Surge Forecasting

    Authors: Patrick Ebel, Brandon Victor, Peter Naylor, Gabriele Meoni, Federico Serva, Rochelle Schneider

    Abstract: Hurricanes and coastal floods are among the most disastrous natural hazards. Both are intimately related to storm surges, as their causes and effects, respectively. However, the short-term forecasting of storm surges has proven challenging, especially when targeting previously unseen locations or sites without tidal gauges. Furthermore, recent work improved short and medium-term weather forecastin…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPR EarthVision 2024

  8. arXiv:2403.09455  [pdf, other]

    cs.SD eess.AS

    The Neural-SRP method for positional sound source localization

    Authors: Eric Grinstein, Toon van Waterschoot, Mike Brookes, Patrick A. Naylor

    Abstract: Steered Response Power (SRP) is a widely used method for the task of sound source localization using microphone arrays, showing satisfactory localization performance on many practical scenarios. However, its performance is diminished under highly reverberant environments. Although Deep Neural Networks (DNNs) have been previously proposed to overcome this limitation, most are trained for a specific…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Presented at Asilomar Conference on Signals, Systems, and Computers

  9. arXiv:2403.05393  [pdf, other]

    eess.AS

    Binaural Speech Enhancement Using Deep Complex Convolutional Transformer Networks

    Authors: Vikas Tokala, Eric Grinstein, Mike Brookes, Simon Doclo, Jesper Jensen, Patrick A. Naylor

    Abstract: Studies have shown that in noisy acoustic environments, providing binaural signals to the user of an assistive listening device may improve speech intelligibility and spatial awareness. This paper presents a binaural speech enhancement method using a complex convolutional neural network with an encoder-decoder architecture and a complex multi-head attention transformer. The model is trained to est…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Accepted to ICASSP 2024

  10. arXiv:2312.16763  [pdf, other]

    eess.AS cs.SD

    Uncertainty Quantification in Machine Learning for Joint Speaker Diarization and Identification

    Authors: Simon W. McKnight, Aidan O. T. Hogg, Vincent W. Neo, Patrick A. Naylor

    Abstract: This paper studies modulation spectrum features ($Φ$) and mel-frequency cepstral coefficients ($Ψ$) in joint speaker diarization and identification (JSID). JSID is important because speaker diarization on its own to distinguish speakers is insufficient for many applications; it is often necessary to identify speakers as well. Machine learning models are set up using convolutional neural networks (CNNs)…

    Submitted 30 December, 2023; v1 submitted 27 December, 2023; originally announced December 2023.

    Comments: 12 pages, 7 figures

  11. arXiv:2311.18689  [pdf, other]

    eess.AS cs.SD eess.SP

    Subspace Hybrid MVDR Beamforming for Augmented Hearing

    Authors: Sina Hafezi, Alastair H. Moore, Pierre H. Guiraud, Patrick A. Naylor, Jacob Donley, Vladimir Tourbabin, Thomas Lunner

    Abstract: Signal-dependent beamformers are advantageous over signal-independent beamformers when the acoustic scenario - be it real-world or simulated - is straightforward in terms of the number of sound sources, the ambient sound field and their dynamics. However, in the context of augmented reality audio using head-worn microphone arrays, the acoustic scenarios encountered are often far from straightforwa…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: 14 pages, 10 figures, submitted for IEEE/ACM Transactions on Audio, Speech, and Language Processing on 23-Nov-2023

  12. Deep-Learning-based Change Detection with Spaceborne Hyperspectral PRISMA data

    Authors: J. F. Amieva, A. Austoni, M. A. Brovelli, L. Ansalone, P. Naylor, F. Serva, B. Le Saux

    Abstract: Change detection (CD) methods have been applied to optical data for decades, while the use of hyperspectral data with a fine spectral resolution has been rarely explored. CD is applied in several sectors, such as environmental monitoring and disaster management. Thanks to the PRecursore IperSpettrale della Missione operativA (PRISMA), hyperspectral-from-space CD is now possible. In this work, we a…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: Accepted at Big Data from Space 2023 (BiDS); 4 pages, 4 figures

    Journal ref: Proceedings of the 2023 conference on Big Data from Space

  13. arXiv:2308.04169  [pdf, other]

    cs.SD cs.LG eess.AS

    Dual input neural networks for positional sound source localization

    Authors: Eric Grinstein, Vincent W. Neo, Patrick A. Naylor

    Abstract: In many signal processing applications, metadata may be advantageously used in conjunction with a high dimensional signal to produce a desired output. In the case of classical Sound Source Localization (SSL) algorithms, information from high dimensional, multichannel audio signals received by many distributed microphones is combined with information describing acoustic properties of the scene, s…

    Submitted 8 August, 2023; originally announced August 2023.

  14. arXiv:2307.15428  [pdf, other]

    cs.CV cs.LG eess.IV

    Implicit neural representation for change detection

    Authors: Peter Naylor, Diego Di Carlo, Arianna Traviglia, Makoto Yamada, Marco Fiorucci

    Abstract: Identifying changes in a pair of 3D aerial LiDAR point clouds, obtained during two distinct time periods over the same geographic region, presents a significant challenge due to the disparities in spatial coverage and the presence of noise in the acquisition system. The most commonly used approaches to detecting changes in point clouds are based on supervised methods which necessitate extensive lab…

    Submitted 30 August, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: Main article is 10 pages + 6 pages of supplementary. Conference style paper

  15. arXiv:2307.06388  [pdf, other]

    math.GT

    Doubles of Gluck twists: a five dimensional approach

    Authors: David Gabai, Patrick Naylor, Hannah Schwartz

    Abstract: Using a 5-dimensional perspective, we balance algebraic and geometric handle cancellation to show that doubles of Gluck twists of certain 2-spheres with two minima are standard. This includes all 2-spheres which are unions of ribbon discs, one of which has undisking number one. As an application, we produce new examples of Schoenflies balls not known to be standard.

    Submitted 12 July, 2023; originally announced July 2023.

    Comments: 24 pages, 21 figures. Comments welcome!

  16. arXiv:2306.16081  [pdf, other]

    cs.SD eess.AS

    Graph neural networks for sound source localization on distributed microphone networks

    Authors: Eric Grinstein, Mike Brookes, Patrick A. Naylor

    Abstract: Distributed Microphone Arrays (DMAs) present many challenges with respect to centralized microphone arrays. An important requirement of applications on these arrays is handling a variable number of input channels. We consider the use of Graph Neural Networks (GNNs) as a solution to this challenge. We present a localization method using the Relation Network GNN, which we show shares many similariti…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: Presented as a poster at ICASSP 2023

  17. arXiv:2306.16071  [pdf, other]

    eess.AS cs.CL cs.SD

    Long-term Conversation Analysis: Exploring Utility and Privacy

    Authors: Francesco Nespoli, Jule Pohlhausen, Patrick A. Naylor, Joerg Bitzer

    Abstract: The analysis of conversations recorded in everyday life requires privacy protection. In this contribution, we explore a privacy-preserving feature extraction method based on input feature dimension reduction, spectral smoothing and the low-cost speaker anonymization technique based on the McAdams coefficient. We assess the utility of the feature extraction methods with a voice activity detection and a…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: Submitted to ITG Conference on Speech Communication, 2023

  18. arXiv:2306.16069  [pdf, other]

    eess.AS cs.SD eess.SP

    Two-Stage Voice Anonymization for Enhanced Privacy

    Authors: Francesco Nespoli, Daniel Barreda, Joerg Bitzer, Patrick A. Naylor

    Abstract: In recent years, the need for privacy preservation when manipulating or storing personal data, including speech, has become a major issue. In this paper, we present a system addressing the speaker-level anonymization problem. We propose and evaluate a two-stage anonymization pipeline exploiting a state-of-the-art anonymization model described in the Voice Privacy Challenge 2022 in combination wit…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: submitted to INTERSPEECH

  19. arXiv:2303.08967  [pdf, other]

    eess.AS eess.SP

    Subspace Hybrid Beamforming for Head-worn Microphone Arrays

    Authors: Sina Hafezi, Alastair H. Moore, Pierre Guiraud, Patrick A. Naylor, Jacob Donley, Vladimir Tourbabin, Thomas Lunner

    Abstract: A two-stage multi-channel speech enhancement method is proposed which consists of a novel adaptive beamformer, Hybrid Minimum Variance Distortionless Response (MVDR), Isotropic-MVDR (Iso), and a novel multi-channel spectral Principal Components Analysis (PCA) denoising. In the first stage, the Hybrid-MVDR performs multiple MVDRs using a dictionary of pre-defined noise field models and picks the mi…

    Submitted 15 March, 2023; originally announced March 2023.

    Comments: 5 pages, 4 figures, accepted for ICASSP 2023

  20. Optimal Transport for Change Detection on LiDAR Point Clouds

    Authors: Marco Fiorucci, Peter Naylor, Makoto Yamada

    Abstract: Unsupervised change detection between airborne LiDAR data points, taken at separate times over the same location, can be difficult due to unmatching spatial support and noise from the acquisition system. Most current approaches to detect changes in point clouds rely heavily on the computation of Digital Elevation Models (DEM) images and supervised methods. Obtaining a DEM leads to LiDAR informatio…

    Submitted 8 November, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No101027956. Marie Skłodowska-Curie Action (Individual Fellowship): OPtimal Transport for Identifying Marauder Activities on LiDAR (OPTIMAL) https://cordis.europa.eu/project/id/101027956

    Journal ref: IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium, Pasadena, CA, USA, 2023, pp. 982-985

  21. arXiv:2212.01306  [pdf, other]

    eess.AS cs.SD

    Relative Acoustic Features for Distance Estimation in Smart-Homes

    Authors: Francesco Nespoli, Daniel Barreda, Patrick A. Naylor

    Abstract: Any audio recording encapsulates the unique fingerprint of the associated acoustic environment, namely the background noise and reverberation. Considering the scenario of a room equipped with a fixed smart speaker device with one or more microphones and a wearable smart device (watch, glasses or smartphone), we employed the improved proportionate normalized least mean square adaptive filter to est…

    Submitted 2 December, 2022; originally announced December 2022.

    Journal ref: Interspeech 2022

  22. arXiv:2209.15472  [pdf, other]

    eess.AS eess.SP

    Binaural Speech Enhancement Using STOI-Optimal Masks

    Authors: Vikas Tokala, Mike Brookes, Patrick A. Naylor

    Abstract: STOI-optimal masking has been previously proposed and developed for single-channel speech enhancement. In this paper, we consider the extension to the task of binaural speech enhancement in which spatial information is known to be important to speech understanding and therefore should be preserved by the enhancement processing. Masks are estimated for each of the binaural channels individually and…

    Submitted 30 September, 2022; originally announced September 2022.

    Comments: Accepted at IWAENC 2022

  23. arXiv:2207.10950  [pdf, other]

    cs.CV

    Scale dependant layer for self-supervised nuclei encoding

    Authors: Peter Naylor, Yao-Hung Hubert Tsai, Marick Laé, Makoto Yamada

    Abstract: Recent developments in self-supervised learning give us the possibility to further reduce human intervention in multi-step pipelines where the focus revolves around particular objects of interest. In the present paper, the focus lies in the nuclei in histopathology images. In particular we aim at extracting cellular information in an unsupervised manner for a downstream task. As nuclei present them…

    Submitted 22 July, 2022; originally announced July 2022.

    Comments: 13 pages, 6 figures, 2 tables

    MSC Class: 68Uxx ACM Class: F.2.2; I.2.7

  24. arXiv:2203.13919  [pdf]

    eess.AS cs.AI

    Spatial Processing Front-End For Distant ASR Exploiting Self-Attention Channel Combinator

    Authors: Dushyant Sharma, Rong Gong, James Fosburgh, Stanislav Yu. Kruchinin, Patrick A. Naylor, Ljubomir Milanovic

    Abstract: We present a novel multi-channel front-end based on channel shortening with the Weighted Prediction Error (WPE) method followed by a fixed MVDR beamformer used in combination with a recently proposed self-attention-based channel combination (SACC) scheme, for tackling the distant ASR problem. We show that the proposed system used as part of a ContextNet based end-to-end (E2E) ASR system outperforms…

    Submitted 25 March, 2022; originally announced March 2022.

    Comments: to be presented at ICASSP 2022

  25. arXiv:2010.07433  [pdf, other]

    math.GT

    Trisections of non-orientable 4-manifolds

    Authors: Maggie Miller, Patrick Naylor

    Abstract: We study trisections of smooth, compact non-orientable 4-manifolds, and introduce trisections of non-orientable 4-manifolds with boundary. In particular, we prove a non-orientable analogue of a classical theorem of Laudenbach-Poénaru. As a consequence, trisection diagrams and Kirby diagrams of closed non-orientable 4-manifolds exist. We discuss how the theory of trisections may be adapted to the s…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: 42 pages, 20 figures

  26. arXiv:2010.03057  [pdf, other]

    math.GT

    Multisections of 4-manifolds

    Authors: Gabriel Islambouli, Patrick Naylor

    Abstract: We introduce multisections of smooth, closed 4-manifolds, which generalize trisections to decompositions with more than three pieces. This decomposition describes an arbitrary smooth, closed 4-manifold as a sequence of cut systems on a surface. We show how to carry out many smooth cut and paste operations in terms of these cut systems. In particular, we show how to implement a cork twist, whereby…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: 29 pages, 26 figures. Comments welcome!

  27. Gluck twisting roll spun knots

    Authors: Patrick Naylor, Hannah Schwartz

    Abstract: We show that the smooth homotopy 4-sphere obtained by Gluck twisting the m-twist n-roll spin of any unknotting number one knot is diffeomorphic to the standard 4-sphere, for any pair of integers (m,n). It follows as a corollary that an infinite collection of twisted doubles of Gompf's infinite order corks are standard.

    Submitted 11 September, 2020; originally announced September 2020.

    Comments: Comments welcome!

    Journal ref: Algebr. Geom. Topol. 22 (2022) 973-990

  28. arXiv:2004.12745  [pdf, other]

    eess.AS cs.SD

    Time-Frequency Analysis and Parameterisation of Knee Sounds for Non-invasive Detection of Osteoarthritis

    Authors: Costas Yiallourides, Patrick A. Naylor

    Abstract: Objective: In this work the potential of non-invasive detection of knee osteoarthritis is investigated using the sounds generated by the knee joint during walking. Methods: The information contained in the time-frequency domain of these signals and its compressed representations is exploited and their discriminant properties are studied. Their efficacy for the task of normal vs abnormal signal cla…

    Submitted 27 April, 2020; originally announced April 2020.

    Comments: Submitted to IEEE Transactions on Biomedical Engineering

  29. arXiv:2001.00473  [pdf, ps, other]

    cs.SD cs.CL eess.AS

    Detection of Glottal Closure Instants from Speech Signals: a Quantitative Review

    Authors: Thomas Drugman, Mark Thomas, Jon Gudnason, Patrick Naylor, Thierry Dutoit

    Abstract: The pseudo-periodicity of voiced speech can be exploited in several speech processing applications. This requires however that the precise locations of the Glottal Closure Instants (GCIs) are available. The focus of this paper is the evaluation of automatic methods for the detection of GCIs directly from the speech waveform. Five state-of-the-art GCI detection algorithms are compared using six dif…

    Submitted 28 December, 2019; originally announced January 2020.

  30. arXiv:1909.01008  [pdf, other]

    eess.AS cs.SD eess.SP

    The LOCATA Challenge: Acoustic Source Localization and Tracking

    Authors: Christine Evers, Heinrich Loellmann, Heinrich Mellmann, Alexander Schmidt, Hendrik Barfuss, Patrick Naylor, Walter Kellermann

    Abstract: The ability to localize and track acoustic events is a fundamental prerequisite for equipping machines with the ability to be aware of and engage with humans in their surrounding environment. However, in realistic scenarios, audio signals are adversely affected by reverberation, noise, interference, and periods of speech inactivity. In dynamic scenarios, where the sources and microphone platforms…

    Submitted 21 October, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing

  31. arXiv:1906.01495  [pdf, other]

    math.GT

    Trisection diagrams and twists of 4-manifolds

    Authors: Patrick Naylor

    Abstract: A theorem of Katanaga, Saeki, Teragaito, and Yamada relates Gluck and Price twists of 4-manifolds. Using trisection diagrams, we give a purely diagrammatic proof of this theorem, and answer a question of Kim and Miller.

    Submitted 26 March, 2022; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: 23 pages, 26 figures. This version has been accepted for publication in Comptes Rendus Mathématique

  32. arXiv:1901.05852  [pdf, other]

    eess.AS cs.SD

    Detecting Sound-Absorbing Materials in a Room from a Single Impulse Response using a CRNN

    Authors: Constantinos Papayiannis, Christine Evers, Patrick A. Naylor

    Abstract: The materials of surfaces in a room play an important role in shaping the auditory experience within it. Different materials absorb energy at different levels. The level of absorption also varies across frequencies. This paper investigates how cues from a measured impulse response in the room can be exploited by machines to detect the materials present. With this motivation, this paper proposes…

    Submitted 27 October, 2019; v1 submitted 17 January, 2019; originally announced January 2019.

    Comments: Submitted for review for IEEE ICASSP 2020

  33. arXiv:1901.03257  [pdf, other]

    eess.AS cs.SD

    Data Augmentation of Room Classifiers using Generative Adversarial Networks

    Authors: Constantinos Papayiannis, Christine Evers, Patrick A. Naylor

    Abstract: The classification of acoustic environments allows for machines to better understand the auditory world around them. The use of deep learning in order to teach machines to discriminate between different rooms is a new area of research. Similarly to other learning tasks, this task suffers from the high-dimensionality and the limited availability of training data. Data augmentation methods have prov…

    Submitted 4 December, 2020; v1 submitted 10 January, 2019; originally announced January 2019.

    Comments: Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing

  34. arXiv:1812.09324  [pdf, other]

    eess.AS cs.SD

    End-to-End Classification of Reverberant Rooms using DNNs

    Authors: Constantinos Papayiannis, Christine Evers, Patrick A. Naylor

    Abstract: Reverberation is present in our workplaces, our homes, concert halls and theatres. This paper investigates how deep learning can use the effect of reverberation on speech to classify a recording in terms of the room in which it was recorded. Existing approaches in the literature rely on domain expertise to manually select acoustic parameters as inputs to classifiers. Estimation of these parameters…

    Submitted 1 November, 2020; v1 submitted 21 December, 2018; originally announced December 2018.

    Comments: Accepted for publication in IEEE/ACM Transactions on Audio, Speech, and Language Processing

  35. arXiv:1811.08482   

    eess.AS cs.SD

    Proceedings of the LOCATA Challenge Workshop -- a satellite event of IWAENC 2018

    Authors: Heinrich W. Loellmann, Christine Evers, Alexander Schmidt, Hendrik Barfuss, Patrick A. Naylor, Walter Kellermann

    Abstract: Algorithms for acoustic source localization and tracking provide estimates of the positional information about active sound sources in acoustic environments and are essential for a wide range of applications such as personal assistants, smart homes, tele-conferencing systems, hearing aids, or autonomous systems. The aim of the IEEE-AASP Challenge on sound source localization and tracking (LOCATA)…

    Submitted 20 August, 2019; v1 submitted 20 November, 2018; originally announced November 2018.

    Comments: Workshop Proceedings

  36. arXiv:1606.03365  [pdf, other]

    cs.SD

    Acoustic Characterization of Environments (ACE) Challenge Results Technical Report

    Authors: James Eaton, Nikolay D. Gaubitch, Alastair H. Moore, Patrick A. Naylor

    Abstract: This document provides the results of the tests of acoustic parameter estimation algorithms on the Acoustic Characterization of Environments (ACE) Challenge Evaluation dataset which were subsequently submitted and written up into papers for the Proceedings of the ACE Challenge. This document is supporting material for a forthcoming journal paper on the ACE Challenge which will provide further anal…

    Submitted 27 June, 2017; v1 submitted 17 December, 2015; originally announced June 2016.

    Comments: Supporting material for Proceedings of the ACE Challenge Workshop - a satellite event of IEEE-WASPAA 2015 (arXiv:1510.00383)

  37. arXiv:1510.07546  [pdf, other]

    cs.SD

    Direct-to-Reverberant Ratio Estimation on the ACE Corpus Using a Two-channel Beamformer

    Authors: James Eaton, Patrick A. Naylor

    Abstract: Direct-to-Reverberant Ratio (DRR) is an important measure for characterizing the properties of a room. The recently proposed DRR Estimation using a Null-Steered Beamformer (DENBE) algorithm was originally tested on simulated data where noise was artificially added to the speech after convolution with impulse responses simulated using the image-source method. This paper evaluates the performance of…

    Submitted 26 October, 2015; originally announced October 2015.

    Comments: In Proceedings of the ACE Challenge Workshop - a satellite event of IEEE-WASPAA 2015 (arXiv:1510.00383). arXiv admin note: text overlap with arXiv:1510.01193

    Report number: ACEChallenge/2015/03

  38. arXiv:1510.04616  [pdf, ps, other]

    cs.SD

    Evaluating the Non-Intrusive Room Acoustics Algorithm with the ACE Challenge

    Authors: Pablo Peso Parada, Dushyant Sharma, Toon van Waterschoot, Patrick A. Naylor

    Abstract: We present a single channel data driven method for non-intrusive estimation of full-band reverberation time and full-band direct-to-reverberant ratio. The method extracts a number of features from reverberant speech and builds a model using a recurrent neural network to estimate the reverberant acoustic parameters. We explore three configurations by including different data and also by combining t…

    Submitted 15 October, 2015; originally announced October 2015.

    Comments: In Proceedings of the ACE Challenge Workshop - a satellite event of IEEE-WASPAA 2015 (arXiv:1510.00383)

    Report number: ACEChallenge/2015/06

  39. arXiv:1510.01193  [pdf, other]

    cs.SD

    Reverberation time estimation on the ACE corpus using the SDD method

    Authors: James Eaton, Patrick A. Naylor

    Abstract: Reverberation Time (T60) is an important measure for characterizing the properties of a room. The author's T60 estimation algorithm was previously tested on simulated data where the noise is artificially added to the speech after convolution with impulse responses simulated using the image method. We test the algorithm on speech convolved with real recorded impulse responses and noise from the s…

    Submitted 5 October, 2015; originally announced October 2015.

    Comments: In Proceedings of the ACE Challenge Workshop - a satellite event of IEEE-WASPAA 2015 (arXiv:1510.00383)

    Report number: ACEChallenge/2015/02

  40. arXiv:1510.00383  [other]

    cs.SD

    Proceedings of the ACE Challenge Workshop - a satellite event of IEEE-WASPAA (2015)

    Authors: James Eaton, Nikolay D. Gaubitch, Alastair H. Moore, Patrick A. Naylor

    Abstract: Several established parameters and metrics have been used to characterize the acoustics of a room. The most important are the Direct-To-Reverberant Ratio (DRR), the Reverberation Time (T60) and the reflection coefficient. The acoustic characteristics of a room based on such parameters can be used to predict the quality and intelligibility of speech signals in that room. Recently, several important…

    Submitted 1 October, 2015; originally announced October 2015.

    Comments: New Paltz, New York, USA

  41. Source Coding in Networks with Covariance Distortion Constraints

    Authors: Adel Zahedi, Jan Østergaard, Søren Holdt Jensen, Patrick A. Naylor, Søren Bech

    Abstract: We consider a source coding problem with a network scenario in mind, and formulate it as a remote vector Gaussian Wyner-Ziv problem under covariance matrix distortions. We define a notion of minimum for two positive-definite matrices based on which we derive an explicit formula for the rate-distortion function (RDF). We then study the special cases and applications of this result. We show that two…

    Submitted 27 September, 2016; v1 submitted 5 April, 2015; originally announced April 2015.

  42. arXiv:1410.5774  [pdf, other]

    math.AT math.GT

    Testing bi-orderability of knot groups

    Authors: Adam Clay, Colin Desmarais, Patrick Naylor

    Abstract: We investigate the bi-orderability of two-bridge knot groups and the groups of knots with 12 or fewer crossings by applying recent theorems of Chiswell, Glass and Wilson. Amongst all knots with 12 or fewer crossings (of which there are 2977), previous theorems were only able to determine bi-orderability of 599 of the corresponding knot groups. With our methods we are able to deal with 191 more.

    Submitted 21 October, 2014; originally announced October 2014.

    Comments: 10 pages, 1 figure

    MSC Class: 57M25; 57M27; 06F15

    Journal ref: Can. Math. Bull. 59 (2016) 472-482

  43. arXiv:1401.6136  [pdf, other]

    cs.IT

    Distributed Remote Vector Gaussian Source Coding with Covariance Distortion Constraints

    Authors: Adel Zahedi, Jan Ostergaard, Soren Holdt Jensen, Patrick Naylor, Soren Bech

    Abstract: In this paper, we consider a distributed remote source coding problem, where a sequence of observations of source vectors is available at the encoder. The problem is to specify the optimal rate for encoding the observations subject to a covariance matrix distortion constraint and in the presence of side information at the decoder. For this problem, we derive lower and upper bounds on the rate-dist…

    Submitted 4 June, 2014; v1 submitted 23 January, 2014; originally announced January 2014.

    Comments: This is the final version accepted at ISIT'14

  44. Distributed Remote Vector Gaussian Source Coding for Wireless Acoustic Sensor Networks

    Authors: Adel Zahedi, Jan Ostergaard, Soren Holdt Jensen, Patrick Naylor, Soren Bech

    Abstract: In this paper, we consider the problem of remote vector Gaussian source coding for a wireless acoustic sensor network. Each node receives messages from multiple nodes in the network and decodes these messages using its own measurement of the sound field as side information. The node's measurement and the estimates of the source resulting from decoding the received messages are then jointly encoded…

    Submitted 16 January, 2014; originally announced January 2014.

    Comments: 10 pages, to be presented at the IEEE DCC'14