Showing 1–5 of 5 results for author: Carofilis, A

  1. arXiv:2506.04981  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Better Semi-supervised Learning for Multi-domain ASR Through Incremental Retraining and Data Filtering

    Authors: Andres Carofilis, Pradeep Rangappa, Srikanth Madikeri, Shashi Kumar, Sergio Burdisso, Jeena Prakash, Esau Villatoro-Tello, Petr Motlicek, Bidisha Sharma, Kadri Hacioglu, Shankar Venkatesan, Saurabh Vyas, Andreas Stolcke

    Abstract: Fine-tuning pretrained ASR models for specific domains is challenging when labeled data is scarce. But unlabeled audio and labeled data from related domains are often available. We propose an incremental semi-supervised learning pipeline that first integrates a small in-domain labeled set and an auxiliary dataset from a closely related domain, achieving a relative improvement of 4% over no auxilia…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: Accepted at Interspeech 2025, Netherlands

  2. arXiv:2506.03681  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering

    Authors: Pradeep Rangappa, Andres Carofilis, Jeena Prakash, Shashi Kumar, Sergio Burdisso, Srikanth Madikeri, Esau Villatoro-Tello, Bidisha Sharma, Petr Motlicek, Kadri Hacioglu, Shankar Venkatesan, Saurabh Vyas, Andreas Stolcke

    Abstract: Fine-tuning pretrained ASR models for specific domains is challenging for small organizations with limited labeled data and computational resources. Here, we explore different data selection pipelines and propose a robust approach that improves ASR adaptation by filtering pseudo-labels generated using Whisper (encoder-decoder) and Zipformer (transducer) models. Our approach integrates multiple sel…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: Accepted at Interspeech 2025, Netherlands

  3. arXiv:2409.13514  [pdf, other]

    cs.CL cs.SD eess.AS

    LM-assisted keyword biasing with Aho-Corasick algorithm for Transducer-based ASR

    Authors: Iuliia Thorbecke, Juan Zuluaga-Gomez, Esaú Villatoro-Tello, Andres Carofilis, Shashi Kumar, Petr Motlicek, Karthik Pandia, Aravind Ganapathiraju

    Abstract: Despite the recent success of end-to-end models for automatic speech recognition, recognizing special rare and out-of-vocabulary words, as well as fast domain adaptation with text, are still challenging. It often happens that biasing to the special entities leads to a degradation in the overall performance. We propose a light on-the-fly method to improve automatic speech recognition performance by…

    Submitted 20 September, 2024; originally announced September 2024.

    Comments: Submitted to ICASSP2025

  4. arXiv:2209.14078  [pdf, other]

    cs.SD eess.AS

    MeWEHV: Mel and Wave Embeddings for Human Voice Tasks

    Authors: Andrés Carofilis, Laura Fernández-Robles, Enrique Alegre, Eduardo Fidalgo

    Abstract: A recent trend in speech processing is the use of embeddings created through machine learning models trained on a specific task with large datasets. By leveraging the knowledge already acquired, these models can be reused in new tasks where the amount of available data is small. This paper proposes a pipeline to create a new model, called Mel and Wave Embeddings for Human Voice Tasks (MeWEHV), cap…

    Submitted 24 June, 2023; v1 submitted 28 September, 2022; originally announced September 2022.

    Comments: Submitted to IEEE Access

  5. arXiv:2005.10086  [pdf, other]

    cs.CV

    Classifying Suspicious Content in Tor Darknet

    Authors: Eduardo Fidalgo Fernandez, Roberto Andrés Vasco Carofilis, Francisco Jáñez Martino, Pablo Blanco Medina

    Abstract: One of the tasks of law enforcement agencies is to find evidence of criminal activity in the Darknet. However, visiting thousands of domains to locate visual information containing illegal acts manually requires a considerable amount of time and resources. Furthermore, the background of the images can pose a challenge when performing classification. To solve this problem, in this paper, we explore…

    Submitted 21 May, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

    Comments: To be published on the JNIC 2020 Conference. Summary of already published research