Showing 1–13 of 13 results for author: Zuccarello, P

Searching in archive eess.
  1. arXiv:2503.11206  [pdf, other]

    cs.SD cs.ET eess.AS

    Comparative Study of Spike Encoding Methods for Environmental Sound Classification

    Authors: Andres Larroza, Javier Naranjo-Alcazar, Vicent Ortiz Castelló, Pedro Zuccarello

    Abstract: Spiking Neural Networks (SNNs) offer a promising approach to reduce energy consumption and computational demands, making them particularly beneficial for embedded machine learning in edge applications. However, data from conventional digital sensors must first be converted into spike trains to be processed using neuromorphic computing technologies. The classification of environmental sounds presen…

    Submitted 2 April, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

    Comments: Under review EUSIPCO 2025

  2. arXiv:2405.18153  [pdf, other]

    cs.SD cs.LG eess.AS

    A Data-Centric Framework for Machine Listening Projects: Addressing Large-Scale Data Acquisition and Labeling through Active Learning

    Authors: Javier Naranjo-Alcazar, Jordi Grau-Haro, Ruben Ribes-Serrano, Pedro Zuccarello

    Abstract: Machine Listening focuses on developing technologies to extract relevant information from audio signals. A critical aspect of these projects is the acquisition and labeling of contextualized data, which is inherently complex and requires specific resources and strategies. Despite the availability of some audio datasets, many are unsuitable for commercial applications. The paper emphasizes the impo…

    Submitted 8 October, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Paper accepted at 8th Future of Information and Communication Conference 2025, 28-29 April, Berlin

  3. arXiv:2306.10843  [pdf, other]

    cs.SD cs.LG eess.AS

    Female mosquito detection by means of AI techniques inside release containers in the context of a Sterile Insect Technique program

    Authors: Javier Naranjo-Alcazar, Jordi Grau-Haro, David Almenar, Pedro Zuccarello

    Abstract: The Sterile Insect Technique (SIT) is a biological pest control technique based on the release into the environment of sterile males of the insect species whose population is to be controlled. The entire SIT process involves mass-rearing within a biofactory, sorting of the specimens by sex, sterilization, and subsequent release of the sterile males into the environment. The reason for avoiding the…

    Submitted 31 May, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

    Comments: Accepted EUSIPCO 2024

  4. arXiv:2206.08007  [pdf, ps, other]

    cs.SD cs.LG eess.AS

    DCASE 2022: Comparative Analysis Of CNNs For Acoustic Scene Classification Under Low-Complexity Considerations

    Authors: Josep Zaragoza-Paredes, Javier Naranjo-Alcazar, Valery Naranjo, Pedro Zuccarello

    Abstract: Acoustic scene classification is an automatic listening problem that aims to assign an audio recording to a pre-defined scene based on its audio data. Over the years (and in past editions of the DCASE) this problem has often been solved with techniques known as ensembles (use of several machine learning models to combine their predictions in the inference phase). While these solutions can show per…

    Submitted 16 June, 2022; originally announced June 2022.

  5. arXiv:2107.14658  [pdf, ps, other]

    cs.SD cs.LG eess.AS

    Task 1A DCASE 2021: Acoustic Scene Classification with mismatch-devices using squeeze-excitation technique and low-complexity constraint

    Authors: Javier Naranjo-Alcazar, Sergi Perez-Castanos, Maximo Cobos, Francesc J. Ferri, Pedro Zuccarello

    Abstract: Acoustic scene classification (ASC) is one of the most popular problems in the field of machine listening. The objective of this problem is to classify an audio clip into one of the predefined scenes using only the audio data. This problem has progressed considerably over the years in the different editions of DCASE. It usually has several subtasks that allow this problem to be tackled with different…

    Submitted 30 July, 2021; originally announced July 2021.

    Comments: Submitted to Task 1a DCASE 2021 Challenge

  6. arXiv:2107.14561  [pdf, other]

    cs.SD cs.LG eess.AS

    TASK3 DCASE2021 Challenge: Sound event localization and detection using squeeze-excitation residual CNNs

    Authors: Javier Naranjo-Alcazar, Sergi Perez-Castanos, Pedro Zuccarello, Francesc J. Ferri, Maximo Cobos

    Abstract: Sound event localisation and detection (SELD) is a problem in the field of automatic listening that aims at the temporal detection and localisation (direction of arrival estimation) of sound events within an audio clip, usually of long duration. Due to the amount of data present in the datasets related to this problem, solutions based on deep learning have positioned themselves at the top of the s…

    Submitted 30 July, 2021; originally announced July 2021.

    Comments: Submitted to Task3 DCASE Challenge 2021

  7. arXiv:2107.13180  [pdf, other]

    cs.MM cs.CV cs.SD eess.AS eess.IV

    Squeeze-Excitation Convolutional Recurrent Neural Networks for Audio-Visual Scene Classification

    Authors: Javier Naranjo-Alcazar, Sergi Perez-Castanos, Aaron Lopez-Garcia, Pedro Zuccarello, Maximo Cobos, Francesc J. Ferri

    Abstract: The use of multiple and semantically correlated sources can provide complementary information to each other that may not be evident when working with individual modalities on their own. In this context, multi-modal models can help produce more accurate and robust predictions in machine learning tasks where audio-visual data is available. This paper presents a multi-modal model for automatic scen…

    Submitted 28 July, 2021; originally announced July 2021.

  8. arXiv:2006.15406  [pdf, other]

    eess.AS cs.LG cs.SD

    Listen carefully and tell: an audio captioning system based on residual learning and gammatone audio representation

    Authors: Sergi Perez-Castanos, Javier Naranjo-Alcazar, Pedro Zuccarello, Maximo Cobos

    Abstract: Automated audio captioning is a machine listening task whose goal is to describe an audio signal using free text. An automated audio captioning system accepts audio as input and outputs a textual description, that is, the caption of the signal. This task can be useful in many applications such as automatic content description or machine-to-machine interaction. In this work,…

    Submitted 8 July, 2020; v1 submitted 27 June, 2020; originally announced June 2020.

    Comments: Submitted to DCASE2020 Workshop, Workshop on Detection and Classification of Acoustic Scenes and Events

  9. arXiv:2006.15321  [pdf, other]

    eess.AS cs.LG cs.SD

    Anomalous Sound Detection using unsupervised and semi-supervised autoencoders and gammatone audio representation

    Authors: Sergi Perez-Castanos, Javier Naranjo-Alcazar, Pedro Zuccarello, Maximo Cobos

    Abstract: Anomalous sound detection (ASD) is currently one of the topical subjects in the machine listening discipline. Unsupervised detection is attracting a lot of interest due to its immediate applicability in many fields. For example, related to industrial processes, the early detection of malfunctions or damage in machines can mean great savings and an improvement in the efficiency of industrial processes…

    Submitted 27 June, 2020; originally announced June 2020.

    Comments: Submitted to DCASE2020 Workshop, Workshop on Detection and Classification of Acoustic Scenes and Events

  10. arXiv:2006.14436  [pdf, other]

    cs.SD eess.AS

    Sound Event Localization and Detection using Squeeze-Excitation Residual CNNs

    Authors: Javier Naranjo-Alcazar, Sergi Perez-Castanos, Jose Ferrandis, Pedro Zuccarello, Maximo Cobos

    Abstract: Sound Event Localization and Detection (SELD) is a problem related to the field of machine listening whose objective is to recognize individual sound events, detect their temporal activity, and estimate their spatial location. Thanks to the emergence of more hard-labeled audio datasets, deep learning techniques have become state-of-the-art solutions. The most common ones are those that implement a…

    Submitted 30 July, 2021; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: Accepted at URSI 2021, Vigo, Spain

  11. arXiv:2003.09284  [pdf, other]

    cs.SD cs.LG eess.AS

    Acoustic Scene Classification with Squeeze-Excitation Residual Networks

    Authors: Javier Naranjo-Alcazar, Sergi Perez-Castanos, Pedro Zuccarello, Maximo Cobos

    Abstract: Acoustic scene classification (ASC) is a problem related to the field of machine listening whose objective is to classify/tag an audio clip with a predefined label describing a scene location (e.g. park, airport, etc.). Many state-of-the-art solutions to ASC incorporate data augmentation techniques and model ensembles. However, considerable improvements can also be achieved only by modifying the ar…

    Submitted 26 June, 2020; v1 submitted 20 March, 2020; originally announced March 2020.

    Journal ref: IEEE Access 2020

  12. arXiv:1906.10891  [pdf, other]

    cs.SD cs.LG eess.AS

    On the performance of residual block design alternatives in convolutional neural networks for end-to-end audio classification

    Authors: Javier Naranjo-Alcazar, Sergi Perez-Castanos, Irene Martin-Morato, Pedro Zuccarello, Maximo Cobos

    Abstract: Residual learning is a recently proposed learning framework to facilitate the training of very deep neural networks. Residual blocks or units are made of a set of stacked layers, where the inputs are added back to their outputs with the aim of creating identity mappings. In practice, such identity mappings are accomplished by means of the so-called skip or residual connections. However, multiple i…

    Submitted 26 September, 2019; v1 submitted 26 June, 2019; originally announced June 2019.

  13. arXiv:1906.04591  [pdf, ps, other]

    cs.SD cs.LG eess.AS

    CNN depth analysis with different channel inputs for Acoustic Scene Classification

    Authors: Sergi Perez-Castanos, Javier Naranjo-Alcazar, Pedro Zuccarello, Maximo Cobos, Francesc J. Ferri

    Abstract: Acoustic scene classification (ASC) has been approached in recent years using deep learning techniques such as convolutional neural networks or recurrent neural networks. Many state-of-the-art solutions are based on image classification frameworks and, as such, a 2D representation of the audio signal is considered for training these networks. Finding the most suitable audio representation is sti…

    Submitted 13 August, 2021; v1 submitted 10 June, 2019; originally announced June 2019.

    Comments: Accepted at URSI 2020, Malaga, Spain