Showing 1–8 of 8 results for author: Denton, T

Searching in archive eess.

  1. arXiv:2505.03071  [pdf, ps, other]

    eess.AS

    The Search for Squawk: Agile Modeling in Bioacoustics

    Authors: Vincent Dumoulin, Otilia Stretcu, Jenny Hamer, Lauren Harrell, Rob Laber, Hugo Larochelle, Bart van Merriënboer, Amanda Navine, Patrick Hart, Ben Williams, Timothy A. C. Lamont, Tries B. Razak, Mars Coral Restoration Team, Sheryn Brodie, Brendan Doohan, Phil Eichinski, Paul Roe, Lin Schwarzkopf, Tom Denton

    Abstract: Passive acoustic monitoring (PAM) has shown great promise in helping ecologists understand the health of animal populations and ecosystems. However, extracting insights from millions of hours of audio recordings requires the development of specialized recognizers. This is typically a challenging task, necessitating large amounts of training data and machine learning expertise. In this work, we int…

    Submitted 10 June, 2025; v1 submitted 5 May, 2025; originally announced May 2025.

  2. arXiv:2404.16436  [pdf]

    cs.SD cs.AI cs.LG eess.AS

    Leveraging tropical reef, bird and unrelated sounds for superior transfer learning in marine bioacoustics

    Authors: Ben Williams, Bart van Merriënboer, Vincent Dumoulin, Jenny Hamer, Eleni Triantafillou, Abram B. Fleishman, Matthew McKown, Jill E. Munger, Aaron N. Rice, Ashlee Lillis, Clemency E. White, Catherine A. D. Hobbs, Tries B. Razak, Kate E. Jones, Tom Denton

    Abstract: Machine learning has the potential to revolutionize passive acoustic monitoring (PAM) for ecological assessments. However, high annotation and compute costs limit the field's efficacy. Generalizable pretrained networks can overcome these costs, but high-quality pretraining requires vast annotated libraries, limiting its current applicability primarily to bird taxa. Here, we identify the optimum pr…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 18 pages, 5 figures

  3. arXiv:2402.15360  [pdf, other]

    q-bio.QM cs.LG cs.SD eess.AS

    All Thresholds Barred: Direct Estimation of Call Density in Bioacoustic Data

    Authors: Amanda K. Navine, Tom Denton, Matthew J. Weldy, Patrick J. Hart

    Abstract: Passive acoustic monitoring (PAM) studies generate thousands of hours of audio, which may be used to monitor specific animal populations, conduct broad biodiversity surveys, detect threats such as poachers, and more. Machine learning classifiers for species identification are increasingly being used to process the vast amount of audio generated by bioacoustic surveys, expediting analysis and incre…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: 14 pages, 6 figures, 3 tables; submitted to Frontiers in Bird Science; Our Hawaiian PAM dataset and classifier scores, as well as annotation information for the three study species, can be found on Zenodo at https://doi.org/10.5281/zenodo.10581530. The fully annotated Powdermill dataset assembled by Chronister et al. that was used in this study is available at https://doi.org/10.1002/ecy.3329

  4. Global birdsong embeddings enable superior transfer learning for bioacoustic classification

    Authors: Burooj Ghani, Tom Denton, Stefan Kahl, Holger Klinck

    Abstract: Automated bioacoustic analysis aids understanding and protection of both marine and terrestrial animals and their habitats across extensive spatiotemporal scales, and typically involves analyzing vast collections of acoustic data. With the advent of deep learning models, classification of important signals from these datasets has markedly improved. These models power critical data analyses for res…

    Submitted 17 November, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

  5. arXiv:2207.02262  [pdf, other]

    cs.SD cs.LG eess.AS

    Ultra-Low-Bitrate Speech Coding with Pretrained Transformers

    Authors: Ali Siahkoohi, Michael Chinen, Tom Denton, W. Bastiaan Kleijn, Jan Skoglund

    Abstract: Speech coding facilitates the transmission of speech over low-bandwidth networks with minimal distortion. Neural-network based speech codecs have recently demonstrated significant improvements in quality over traditional approaches. While this new generation of codecs is capable of synthesizing high-fidelity speech, their use of recurrent or convolutional layers often restricts their effective rec…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: Proceedings of INTERSPEECH 2022

  6. arXiv:2110.03209  [pdf, other]

    eess.AS

    Improving Bird Classification with Unsupervised Sound Separation

    Authors: Tom Denton, Scott Wisdom, John R. Hershey

    Abstract: This paper addresses the problem of species classification in bird song recordings. The massive amount of available field recordings of birds presents an opportunity to use machine learning to automatically track bird populations. However, it also poses a problem: such field recordings typically contain significant environmental noise and overlapping vocalizations that interfere with classificatio…

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: 5 pages, 3 figures. Examples available at https://bird-mixit.github.io

  7. arXiv:2102.11906  [pdf, other]

    eess.AS cs.SD

    Handling Background Noise in Neural Speech Generation

    Authors: Tom Denton, Alejandro Luebs, Felicia S. C. Lim, Andrew Storus, Hengchin Yeh, W. Bastiaan Kleijn, Jan Skoglund

    Abstract: Recent advances in neural-network based generative modeling of speech has shown great potential for speech coding. However, the performance of such models drops when the input is not clean speech, e.g., in the presence of background noise, preventing its use in practical applications. In this paper we examine the reason and discuss methods to overcome this issue. Placing a denoising preprocessing…

    Submitted 23 February, 2021; originally announced February 2021.

    Comments: 5 pages, 3 figures, presented at the Asilomar Conference on Signals, Systems, and Computers 2020

  8. arXiv:2102.09660  [pdf, other]

    eess.AS cs.SD

    Generative Speech Coding with Predictive Variance Regularization

    Authors: W. Bastiaan Kleijn, Andrew Storus, Michael Chinen, Tom Denton, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Hengchin Yeh

    Abstract: The recent emergence of machine-learning based generative models for speech suggests a significant reduction in bit rate for speech codecs is possible. However, the performance of generative models deteriorates significantly with the distortions present in real-world input signals. We argue that this deterioration is due to the sensitivity of the maximum likelihood criterion to outliers and the in…

    Submitted 18 February, 2021; originally announced February 2021.

    MSC Class: 94; ACM Class: I.m