Showing 1–8 of 8 results for author: Lavrentyeva, G

  1. arXiv:2210.16231  [pdf, other]

    cs.SD cs.LG eess.AS

    Universal speaker recognition encoders for different speech segments duration

    Authors: Sergey Novoselov, Vladimir Volokhov, Galina Lavrentyeva

    Abstract: Creating universal speaker encoders that are robust to different acoustic and speech duration conditions is a big challenge today. According to our observations, systems trained on short speech segments are optimal for short-phrase speaker verification, and systems trained on long segments are superior for long-segment verification. A system trained simultaneously on pooled short and long speech…

    Submitted 28 October, 2022; originally announced October 2022.

    Comments: Submitted to ICASSP'23

  2. arXiv:2203.15106  [pdf, other]

    cs.SD cs.LG eess.AS

    Investigation of Different Calibration Methods for Deep Speaker Embedding based Verification Systems

    Authors: Galina Lavrentyeva, Sergey Novoselov, Andrey Shulipa, Marina Volkova, Aleksandr Kozlov

    Abstract: Deep speaker embedding extractors have already become new state-of-the-art systems in the speaker verification field. However, the problem of verification score calibration for such systems often remains out of focus. An irrelevant score calibration leads to serious issues, especially in the case of unknown acoustic conditions, even if we use a strong speaker verification system in terms of thresh…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Submitted to Interspeech2022

  3. arXiv:2203.15095  [pdf, other]

    cs.SD cs.LG eess.AS

    Robust Speaker Recognition with Transformers Using wav2vec 2.0

    Authors: Sergey Novoselov, Galina Lavrentyeva, Anastasia Avdeeva, Vladimir Volokhov, Aleksei Gusev

    Abstract: Recent advances in unsupervised speech representation learning have opened up new approaches and set a new state of the art for diverse types of speech processing tasks. This paper presents an investigation of using wav2vec 2.0 deep speech representations for the speaker recognition task. The proposed fine-tuning procedure of wav2vec 2.0 with simple TDNN and statistics pooling back-end using additive a…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Submitted to Interspeech2022. arXiv admin note: text overlap with arXiv:2111.02298

  4. arXiv:2111.02298  [pdf, other]

    cs.SD cs.LG eess.AS

    STC speaker recognition systems for the NIST SRE 2021

    Authors: Anastasia Avdeeva, Aleksei Gusev, Igor Korsunov, Alexander Kozlov, Galina Lavrentyeva, Sergey Novoselov, Timur Pekhovsky, Andrey Shulipa, Alisa Vinogradova, Vladimir Volokhov, Evgeny Smirnov, Vasily Galyuk

    Abstract: This paper presents a description of STC Ltd. systems submitted to the NIST 2021 Speaker Recognition Evaluation for both fixed and open training conditions. These systems consist of a number of diverse subsystems based on using deep neural networks as feature extractors. During the NIST 2021 SRE challenge we focused on the training of the state-of-the-art deep speaker embedding extractors like R…

    Submitted 3 November, 2021; originally announced November 2021.

  5. arXiv:2002.06033  [pdf, other]

    cs.SD cs.CL eess.AS stat.ML

    Deep Speaker Embeddings for Far-Field Speaker Recognition on Short Utterances

    Authors: Aleksei Gusev, Vladimir Volokhov, Tseren Andzhukaev, Sergey Novoselov, Galina Lavrentyeva, Marina Volkova, Alice Gazizullina, Andrey Shulipa, Artem Gorlanov, Anastasia Avdeeva, Artem Ivanov, Alexander Kozlov, Timur Pekhovsky, Yuri Matveev

    Abstract: Speaker recognition systems based on deep speaker embeddings have achieved significant performance in controlled conditions according to the results obtained for early NIST SRE (Speaker Recognition Evaluation) datasets. From a practical point of view, taking into account the increased interest in virtual assistants (such as Amazon Alexa, Google Home, Apple Siri, etc.), speaker verification on sho…

    Submitted 14 February, 2020; originally announced February 2020.

    Comments: Submitted to Odyssey 2020

  6. arXiv:1904.06093  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    STC Speaker Recognition Systems for the VOiCES From a Distance Challenge

    Authors: Sergey Novoselov, Aleksei Gusev, Artem Ivanov, Timur Pekhovsky, Andrey Shulipa, Galina Lavrentyeva, Vladimir Volokhov, Alexandr Kozlov

    Abstract: This paper presents the Speech Technology Center (STC) speaker recognition (SR) systems submitted to the VOiCES From a Distance challenge 2019. The challenge's SR task is focused on the problem of speaker recognition in single-channel distant/far-field audio under noisy conditions. In this work we investigate different deep neural network architectures for speaker embedding extraction to solve th…

    Submitted 12 April, 2019; originally announced April 2019.

    Comments: Submitted to Interspeech 2019, Graz, Austria

  7. arXiv:1904.05576  [pdf, other]

    cs.SD cs.CL cs.CR cs.LG eess.AS stat.ML

    STC Antispoofing Systems for the ASVspoof2019 Challenge

    Authors: Galina Lavrentyeva, Sergey Novoselov, Andzhukaev Tseren, Marina Volkova, Artem Gorlanov, Alexandr Kozlov

    Abstract: This paper describes the Speech Technology Center (STC) antispoofing systems submitted to the ASVspoof 2019 challenge. ASVspoof 2019 is an extended version of the previous challenges and includes two evaluation conditions: a logical access use-case scenario with speech synthesis and voice conversion attack types, and a physical access use-case scenario with replay attacks. During the challenge we dev…

    Submitted 11 April, 2019; originally announced April 2019.

    Comments: Submitted to Interspeech 2019, Graz, Austria

  8. arXiv:1803.05307  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD stat.ML

    Deep CNN based feature extractor for text-prompted speaker recognition

    Authors: Sergey Novoselov, Oleg Kudashev, Vadim Schemelinin, Ivan Kremnev, Galina Lavrentyeva

    Abstract: Deep learning is still not a very common tool in the speaker verification field. We study deep convolutional neural network performance in the text-prompted speaker verification task. The prompted passphrase is segmented into word states (i.e., digits) to test each digit utterance separately. We train a single high-level feature extractor for all states and use the cosine similarity metric for scoring. Th…

    Submitted 13 March, 2018; originally announced March 2018.

    Comments: Submitted to ICASSP 2018