Showing 1–3 of 3 results for author: Trong, T N

  1. arXiv:2109.13510  [pdf, other]

    cs.LG cs.CL cs.SD eess.AS

    VoxCeleb Enrichment for Age and Gender Recognition

    Authors: Khaled Hechmi, Trung Ngo Trong, Ville Hautamaki, Tomi Kinnunen

    Abstract: VoxCeleb datasets are widely used in speaker recognition studies. Our work serves two purposes. First, we provide speaker age labels and (an alternative) annotation of speaker gender. Second, we demonstrate the use of this metadata by constructing age and gender recognition models with different features and classifiers. We query different celebrity databases and apply consensus rules to derive ag…

    Submitted 20 December, 2021; v1 submitted 28 September, 2021; originally announced September 2021.

    Comments: Accepted for presentation at ASRU 2021; repository: https://github.com/hechmik/voxceleb_enrichment_age_gender

  2. arXiv:1907.03164  [pdf, other]

    cs.LG eess.AS stat.ML

    Towards Debugging Deep Neural Networks by Generating Speech Utterances

    Authors: Bilal Soomro, Anssi Kanervisto, Trung Ngo Trong, Ville Hautamäki

    Abstract: Deep neural networks (DNN) are able to successfully process and classify speech utterances. However, understanding the reason behind a classification by DNN is difficult. One such debugging method used with image classification DNNs is activation maximization, which generates example-images that are classified as one of the classes. In this work, we evaluate applicability of this method to speech…

    Submitted 6 July, 2019; originally announced July 2019.

    Comments: Accepted to Interspeech 2019

  3. arXiv:1904.07386  [pdf, other]

    eess.AS cs.CL cs.SD

    I4U Submission to NIST SRE 2018: Leveraging from a Decade of Shared Experiences

    Authors: Kong Aik Lee, Ville Hautamaki, Tomi Kinnunen, Hitoshi Yamamoto, Koji Okabe, Ville Vestman, Jing Huang, Guohong Ding, Hanwu Sun, Anthony Larcher, Rohan Kumar Das, Haizhou Li, Mickael Rouvier, Pierre-Michel Bousquet, Wei Rao, Qing Wang, Chunlei Zhang, Fahimeh Bahmaninezhad, Hector Delgado, Jose Patino, Qiongqiong Wang, Ling Guo, Takafumi Koshinaka, Jiacen Zhang, Koichi Shinoda, et al. (21 additional authors not shown)

    Abstract: The I4U consortium was established to facilitate a joint entry to NIST speaker recognition evaluations (SRE). The latest edition of such joint submission was in SRE 2018, in which the I4U submission was among the best-performing systems. SRE'18 also marks the 10-year anniversary of I4U consortium into NIST SRE series of evaluation. The primary objective of the current paper is to summarize the res…

    Submitted 15 April, 2019; originally announced April 2019.

    Comments: 5 pages