Showing 1–5 of 5 results for author: Suni, A

  1. arXiv:2506.08564

    cs.CL cs.SD eess.AS

    Neighbors and relatives: How do speech embeddings reflect linguistic connections across the world?

    Authors: Tuukka Törö, Antti Suni, Juraj Šimko

    Abstract: Investigating linguistic relationships on a global scale requires analyzing diverse features such as syntax, phonology and prosody, which evolve at varying rates influenced by internal diversification, language contact, and sociolinguistic factors. Recent advances in machine learning (ML) offer complementary alternatives to traditional historical and typological approaches. Instead of relying on e…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: 27 pages, 11 figures (+5 supplementary), submitted to PLOS One

  2. arXiv:2306.09814

    eess.AS cs.CL

    Investigating the Utility of Surprisal from Large Language Models for Speech Synthesis Prosody

    Authors: Sofoklis Kakouros, Juraj Šimko, Martti Vainio, Antti Suni

    Abstract: This paper investigates the use of word surprisal, a measure of the predictability of a word in a given context, as a feature to aid speech synthesis prosody. We explore how word surprisal extracted from large language models (LLMs) correlates with word prominence, a signal-based measure of the salience of a word in a given discourse. We also examine how context length and LLM size affect the resu…

    Submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted at SSW 2023

  3. Prosodic Prominence and Boundaries in Sequence-to-Sequence Speech Synthesis

    Authors: Antti Suni, Sofoklis Kakouros, Martti Vainio, Juraj Šimko

    Abstract: Recent advances in deep learning methods have elevated synthetic speech quality to human level, and the field is now moving towards addressing prosodic variation in synthetic speech. Despite successes in this effort, the state-of-the-art systems fall short of faithfully reproducing local prosodic events that give rise to, e.g., word-level emphasis and phrasal structure. This type of prosodic variat…

    Submitted 29 June, 2020; originally announced June 2020.

  4. arXiv:1908.02262

    cs.CL

    Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations

    Authors: Aarne Talman, Antti Suni, Hande Celikkanat, Sofoklis Kakouros, Jörg Tiedemann, Martti Vainio

    Abstract: In this paper we introduce a new natural language processing dataset and benchmark for predicting prosodic prominence from written text. To our knowledge this will be the largest publicly available dataset with prosodic labels. We describe the dataset construction and the resulting benchmark dataset in detail and train a number of different models ranging from feature-based classifiers to neural n…

    Submitted 6 August, 2019; originally announced August 2019.

    Comments: NoDaLiDa 2019 camera ready

  5. arXiv:1510.01949

    cs.CL cs.SD

    Hierarchical Representation of Prosody for Statistical Speech Synthesis

    Authors: Antti Suni, Daniel Aalto, Martti Vainio

    Abstract: Prominences and boundaries are the essential constituents of prosodic structure in speech. They provide for means to chunk the speech stream into linguistically relevant units by providing them with relative saliences and demarcating them within coherent utterance structures. Prominences and boundaries have both been widely used in both basic research on prosody as well as in text-to-speech synthe…

    Submitted 7 October, 2015; originally announced October 2015.

    Comments: 22 pages, 5 figures