close this message
arXiv smileybones

arXiv Is Hiring a DevOps Engineer

Work on one of the world's most important websites and make an impact on open science.

View Jobs
Skip to main content
Cornell University

arXiv Is Hiring a DevOps Engineer

View Jobs
We gratefully acknowledge support from the Simons Foundation, member institutions, and all contributors. Donate
arxiv logo > q-bio > arXiv:2401.17320

Help | Advanced Search

arXiv logo
Cornell University Logo

quick links

  • Login
  • Help Pages
  • About

Quantitative Biology > Neurons and Cognition

arXiv:2401.17320 (q-bio)
[Submitted on 27 Jan 2024]

Title:Sigma-lognormal modeling of speech

Authors:C. Carmona-Duarte, M.A.Ferrer, R. Plamondon, A. Gomez-Rodellar, P. Gomez-Vilda
View a PDF of the paper titled Sigma-lognormal modeling of speech, by C. Carmona-Duarte and 4 other authors
View PDF
Abstract:Human movement studies and analyses have been fundamental in many scientific domains, ranging from neuroscience to education, pattern recognition to robotics, health care to sports, and beyond. Previous speech motor models were proposed to understand how speech movement is produced and how the resulting speech varies when some parameters are changed. However, the inverse approach, in which the muscular response parameters and the subject's age are derived from real continuous speech, is not possible with such models. Instead, in the handwriting field, the kinematic theory of rapid human movements and its associated Sigma-lognormal model have been applied successfully to obtain the muscular response parameters. This work presents a speech kinematics based model that can be used to study, analyze, and reconstruct complex speech kinematics in a simplified manner. A method based on the kinematic theory of rapid human movements and its associated Sigma lognormal model are applied to describe and to parameterize the asymptotic impulse response of the neuromuscular networks involved in speech as a response to a neuromotor command. The method used to carry out transformations from formants to a movement observation is also presented. Experiments carried out with the (English) VTR TIMIT database and the (German) Saarbrucken Voice Database, including people of different ages, with and without laryngeal pathologies, corroborate the link between the extracted parameters and aging, on the one hand, and the proportion between the first and second formants required in applying the kinematic theory of rapid human movements, on the other. The results should drive innovative developments in the modeling and understanding of speech kinematics.
Comments: Published in Open Acces
Subjects: Neurons and Cognition (q-bio.NC); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as: arXiv:2401.17320 [q-bio.NC]
  (or arXiv:2401.17320v1 [q-bio.NC] for this version)
  https://doi.org/10.48550/arXiv.2401.17320
arXiv-issued DOI via DataCite
Journal reference: Cognitive Computation, 13(2). pp. 488-503, 2021
Related DOI: https://doi.org/10.1007/s12559-020-09803-8
DOI(s) linking to related resources

Submission history

From: Cristina Carmona-Duarte [view email]
[v1] Sat, 27 Jan 2024 18:00:20 UTC (2,295 KB)
Full-text links:

Access Paper:

    View a PDF of the paper titled Sigma-lognormal modeling of speech, by C. Carmona-Duarte and 4 other authors
  • View PDF
  • Other Formats
license icon view license
Current browse context:
q-bio.NC
< prev   |   next >
new | recent | 2024-01
Change to browse by:
cs
cs.CV
cs.SD
eess
eess.AS
q-bio

References & Citations

  • NASA ADS
  • Google Scholar
  • Semantic Scholar
a export BibTeX citation Loading...

BibTeX formatted citation

×
Data provided by:

Bookmark

BibSonomy logo Reddit logo

Bibliographic and Citation Tools

Bibliographic Explorer (What is the Explorer?)
Connected Papers (What is Connected Papers?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)

Code, Data and Media Associated with this Article

alphaXiv (What is alphaXiv?)
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Hugging Face (What is Huggingface?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)

Demos

Replicate (What is Replicate?)
Hugging Face Spaces (What is Spaces?)
TXYZ.AI (What is TXYZ.AI?)

Recommenders and Search Tools

Influence Flower (What are Influence Flowers?)
CORE Recommender (What is CORE?)
  • Author
  • Venue
  • Institution
  • Topic

arXivLabs: experimental projects with community collaborators

arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.

Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.

Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.

Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)
  • About
  • Help
  • contact arXivClick here to contact arXiv Contact
  • subscribe to arXiv mailingsClick here to subscribe Subscribe
  • Copyright
  • Privacy Policy
  • Web Accessibility Assistance
  • arXiv Operational Status
    Get status notifications via email or slack