Showing 1–2 of 2 results for author: Kwasny, D

  1. arXiv:2012.01551  [pdf, other]

    eess.AS cs.SD

    Joint gender and age estimation based on speech signals using x-vectors and transfer learning

    Authors: Damian Kwasny, Daria Hemmerling

    Abstract: In this paper we extend the x-vector framework for the task of speaker's age estimation and gender classification. In particular, we replace the baseline multilayer-TDNN architecture with QuartzNet, a convolutional architecture that has gained success in the field of speech recognition. We further propose a two-staged transfer learning scheme, utilizing large scale speech datasets: VoxCeleb and Co…

    Submitted 2 December, 2020; originally announced December 2020.

  2. arXiv:1811.01609  [pdf, ps, other]

    cs.SD cs.LG eess.AS stat.ML

    ConvS2S-VC: Fully convolutional sequence-to-sequence voice conversion

    Authors: Hirokazu Kameoka, Kou Tanaka, Damian Kwasny, Takuhiro Kaneko, Nobukatsu Hojo

    Abstract: This paper proposes a voice conversion (VC) method using sequence-to-sequence (seq2seq or S2S) learning, which flexibly converts not only the voice characteristics but also the pitch contour and duration of input speech. The proposed method, called ConvS2S-VC, has three key features. First, it uses a model with a fully convolutional architecture. This is particularly advantageous in that it is sui…

    Submitted 6 October, 2020; v1 submitted 5 November, 2018; originally announced November 2018.

    Comments: Published in IEEE/ACM Trans. ASLP https://ieeexplore.ieee.org/document/9113442