Showing 1–2 of 2 results for author: Barahona, S

  1. arXiv:2505.15320

    eess.AS cs.SD

    Analysis of ABC Frontend Audio Systems for the NIST-SRE24

    Authors: Sara Barahona, Anna Silnova, Ladislav Mošner, Junyi Peng, Oldřich Plchot, Johan Rohdin, Lin Zhang, Jiangyu Han, Petr Palka, Federico Landini, Lukáš Burget, Themos Stafylakis, Sandro Cumani, Dominik Boboš, Miroslav Hlavaček, Martin Kodovsky, Tomáš Pavlíček

    Abstract: We present a comprehensive analysis of the embedding extractors (frontends) developed by the ABC team for the audio track of NIST SRE 2024. We follow the two scenarios imposed by NIST: using only a provided set of telephone recordings for training (fixed) or adding publicly available data (open condition). Under these constraints, we develop the best possible speaker embedding extractors for the p…

    Submitted 21 May, 2025; originally announced May 2025.

    Comments: Accepted at Interspeech 2025

  2. arXiv:2410.02364

    eess.AS

    State-of-the-art Embeddings with Video-free Segmentation of the Source VoxCeleb Data

    Authors: Sara Barahona, Ladislav Mošner, Themos Stafylakis, Oldřich Plchot, Junyi Peng, Lukáš Burget, Jan Černocký

    Abstract: In this paper, we refine and validate our method for training speaker embedding extractors using weak annotations. More specifically, we use only the audio stream of the source VoxCeleb videos and the names of the celebrities without knowing the time intervals in which they appear in the recording. We experiment with hyperparameters and embedding extractors based on ResNet and WavLM. We show that…

    Submitted 3 October, 2024; originally announced October 2024.

    Comments: This work has been submitted to the IEEE for possible publication