Showing 1–4 of 4 results for author: Zarazaga, P P

  1. arXiv:2409.14823

    cs.SD eess.AS

    HiFi-Glot: Neural Formant Synthesis with Differentiable Resonant Filters

    Authors: Lauri Juvela, Pablo Pérez Zarazaga, Gustav Eje Henter, Zofia Malisz

    Abstract: We introduce an end-to-end neural speech synthesis system that uses the source-filter model of speech production. Specifically, we apply differentiable resonant filters to a glottal waveform generated by a neural vocoder. The aim is to obtain a controllable synthesiser, similar to classic formant synthesis, but with much higher perceptual quality - filling a research gap in current neural waveform…

    Submitted 23 September, 2024; originally announced September 2024.

    Comments: Submitted to ICASSP 2025

  2. arXiv:2307.03168

    eess.AS

    Recovering implicit pitch contours from formants in whispered speech

    Authors: Pablo Pérez Zarazaga, Zofia Malisz

    Abstract: Whispered speech is characterised by a noise-like excitation that results in the lack of fundamental frequency. Considering that prosodic phenomena such as intonation are perceived through f0 variation, the perception of whispered prosody is relatively difficult. At the same time, studies have shown that speakers do attempt to produce intonation when whispering and that prosodic variability is bei…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: 5 pages, 3 figures, 2 tables, Accepted at ICPhS 2023

  3. arXiv:2306.01957

    eess.AS

    Speaker-independent neural formant synthesis

    Authors: Pablo Pérez Zarazaga, Zofia Malisz, Gustav Eje Henter, Lauri Juvela

    Abstract: We describe speaker-independent speech synthesis driven by a small set of phonetically meaningful speech parameters such as formant frequencies. The intention is to leverage deep-learning advances to provide a highly realistic signal generator that includes control affordances required for stimulus creation in the speech sciences. Our approach turns input speech parameters into predicted mel-spect…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: 5 pages, 4 figures. Article accepted at INTERSPEECH 2023

  4. arXiv:2303.07442

    eess.AS cs.SD

    A processing framework to access large quantities of whispered speech found in ASMR

    Authors: Pablo Pérez Zarazaga, Gustav Eje Henter, Zofia Malisz

    Abstract: Whispering is a ubiquitous mode of communication that humans use daily. Despite this, whispered speech has been poorly served by existing speech technology due to a shortage of resources and processing methodology. To remedy this, this paper provides a processing framework that enables access to large and unique data of high-quality whispered speech. We obtain the data from recordings submitted to…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: Accepted at ICASSP 2023, 5 pages, 2 figures, 2 tables