Search | arXiv e-print repository

doi 10.1016/j.jvoice.2025.03.028

Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature

Authors: Jan Vrba, Jakub Steinbach, Tomáš Jirsa, Laura Verde, Roberta De Fazio, Yuwen Zeng, Kei Ichiji, Lukáš Hájek, Zuzana Sedláková, Zuzana Urbániová, Martin Chovanec, Jan Mareš, Noriyasu Homma

Abstract: Purpose: We introduce a novel methodology for voice pathology detection using the publicly available Saarbrücken Voice Database (SVD) and a robust feature set combining commonly used acoustic handcrafted features with two novel ones: pitch difference (relative variation in fundamental frequency) and NaN feature (failed fundamental frequency estimation). Methods: We evaluate six machine learning (ML) algorithms -- support vector machine, k-nearest neighbors, naive Bayes, decision tree, random forest, and AdaBoost -- using grid search for feasible hyperparameters and 20480 different feature subsets. Top 1000 classification models -- feature subset combinations for each ML algorithm are validated with repeated stratified cross-validation. To address class imbalance, we apply K-Means SMOTE to augment the training data. Results: Our approach achieves 85.61%, 84.69% and 85.22% unweighted average recall (UAR) for females, males and combined results respectively. We intentionally omit accuracy as it is a highly biased metric for imbalanced data. Conclusion: Our study demonstrates that by following the proposed methodology and feature engineering, there is a potential in detection of various voice pathologies using ML models applied to the simplest vocal task, a sustained utterance of the vowel /a:/. To enable easier use of our methodology and to support our claims, we provide a publicly available GitHub repository with DOI 10.5281/zenodo.13771573. Finally, we provide a REFORMS checklist to enhance readability, reproducibility and justification of our approach.

Submitted 14 March, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

Comments: Code repository: https://github.com/aailab-uct/Automated-Robust-and-Reproducible-Voice-Pathology-Detection, Supplementary materials: https://doi.org/10.5281/zenodo.14793017
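
Note on the metric: the abstract above reports unweighted average recall (UAR) and omits accuracy because accuracy is biased on imbalanced data. The following is a minimal sketch of that distinction, not code from the authors' repository. The function names and the toy labels are made up for illustration; UAR is computed here as the plain mean of per-class recalls.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; every class counts equally regardless of size."""
    recalls = []
    for label in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


def accuracy(y_true, y_pred):
    """Fraction of correct predictions; dominated by the majority class."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy, made-up labels: 9 samples of class 1 and 1 sample of class 0.
y_true = [1] * 9 + [0]
# A degenerate classifier that always predicts class 1.
y_pred = [1] * 10

print(accuracy(y_true, y_pred))                    # 0.9
print(unweighted_average_recall(y_true, y_pred))   # 0.5
```

The always-positive classifier looks strong by accuracy but scores at chance level by UAR, which is the bias the abstract refers to.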

arXiv:2111.12877 [pdf]

doi 10.1109/TNNLS.2021.3123533

A Letter on Convergence of In-Parameter-Linear Nonlinear Neural Architectures with Gradient Learnings

Authors: Ivo Bukovsky, Gejza Dohnal, Peter M. Benes, Kei Ichiji, Noriyasu Homma

Abstract: This letter summarizes and proves the concept of bounded-input bounded-state (BIBS) stability for weight convergence of a broad family of in-parameter-linear nonlinear neural architectures as it generally applies to a broad family of incremental gradient learning algorithms. A practical BIBS convergence condition results from the derived proofs for every individual learning point or batches for real-time applications.

Submitted 24 November, 2021; originally announced November 2021.

Comments: IEEE Trans. Neural Netw. Learn. Syst., Early Access

arXiv:1606.07149 [pdf]

doi 10.1109/TNNLS.2016.2572310

An Approach to Stable Gradient Descent Adaptation of Higher-Order Neural Units

Authors: Ivo Bukovsky, Noriyasu Homma

Abstract: Stability evaluation of a weight-update system of higher-order neural units (HONUs) with polynomial aggregation of neural inputs (also known as classes of polynomial neural networks) for adaptation of both feedforward and recurrent HONUs by a gradient descent method is introduced. An essential core of the approach is based on spectral radius of a weight-update system, and it allows stability monitoring and its maintenance at every adaptation step individually. Assuring stability of the weight-update system (at every single adaptation step) naturally results in adaptation stability of the whole neural architecture that adapts to target data. As an aside, the used approach highlights the fact that the weight optimization of HONU is a linear problem, so the proposed approach can be generally extended to any neural architecture that is linear in its adaptable parameters.

Submitted 22 June, 2016; originally announced June 2016.

Comments: 2016, 13 pages

Journal ref: IEEE Transactions on Neural Networks and Learning Systems, ISSN: 2162-237X, 2016
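
Note on the stability idea: the abstract above monitors the spectral radius of the weight-update system at every adaptation step. The following is a minimal sketch of that idea under simplifying assumptions, not the authors' implementation. It assumes a static quadratic unit y = w^T colx trained sample by sample with plain gradient descent, w(k+1) = w(k) + mu*e(k)*colx(k) with e(k) = d(k) - w(k)^T colx(k), which can be rewritten as w(k+1) = A w(k) + mu*d(k)*colx(k) with A = I - mu*colx*colx^T. All function names, the toy input, and the learning rate are hypothetical.

```python
from itertools import combinations_with_replacement


def quadratic_features(x):
    """Second-order terms x_i * x_j (i <= j) of the input augmented with a bias of 1."""
    augmented = [1.0] + list(x)
    return [a * b for a, b in combinations_with_replacement(augmented, 2)]


def update_spectral_radius(colx, mu):
    """Spectral radius of A = I - mu * colx * colx^T.

    A is a rank-one perturbation of the identity, so its eigenvalues are
    1 (n - 1 times) and 1 - mu * ||colx||^2.
    """
    nontrivial = 1.0 - mu * sum(v * v for v in colx)
    if len(colx) == 1:
        return abs(nontrivial)
    return max(1.0, abs(nontrivial))


def stable_learning_rate(colx, mu):
    """Shrink mu for this step so that mu * ||colx||^2 <= 2, i.e. rho(A) <= 1."""
    norm_sq = sum(v * v for v in colx)
    if norm_sq == 0.0:
        return mu
    return min(mu, 2.0 / norm_sq)


# Toy, made-up input sample and learning rate.
colx = quadratic_features([1.0, 2.0])      # [1, 1, 2, 1, 2, 4], squared norm 27
mu = 0.1
print(update_spectral_radius(colx, mu))    # greater than 1: this step can diverge
mu_safe = stable_learning_rate(colx, mu)
print(update_spectral_radius(colx, mu_safe))
```

Because the model is linear in its weights, the check needs only the current feature vector and learning rate, so it can be repeated at every adaptation step.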

Showing 1–3 of 3 results for author: Homma, N