Towards Robust Voice Pathology Detection

Harar, Pavol; Galaz, Zoltan; Alonso-Hernandez, Jesus B.; Mekyska, Jiri; Burget, Radim; Smekal, Zdenek

doi:10.1007/s00521-018-3464-7

arXiv:1907.06129 (cs)

[Submitted on 13 Jul 2019]

Abstract: Automatic, objective, non-invasive detection of pathological voice based on computerized analysis of acoustic signals can play an important role in early diagnosis, progression tracking and even effective treatment of pathological voices. In the search for such a robust voice pathology detection system, we investigated 3 distinct classifiers within the supervised learning and anomaly detection paradigms. We conducted a set of experiments using a variety of input data: raw waveforms, spectrograms, mel-frequency cepstral coefficients (MFCC) and conventional acoustic (dysphonic) features (AF). In comparison with previously published works, this article is the first to utilize a combination of 4 different databases comprising normophonic and pathological recordings of sustained phonation of the vowel /a/, unrestricted to a subset of vocal pathologies. Furthermore, to the best of our knowledge, this article is the first to explore gradient boosted trees and deep learning for this application. The following best classification performances, measured by F1 score on a dedicated test set, were achieved: XGBoost (0.733) using AF and MFCC, DenseNet (0.621) using MFCC, and Isolation Forest (0.610) using AF. Even though these results are of exploratory character, the conducted experiments show the promising potential of gradient boosting and deep learning methods to robustly detect voice pathologies.
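For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1 score is computed from binary predictions. It is not the authors' evaluation code: the function name `f1_score`, the toy labels, and the choice of the pathological class as the positive class are all assumptions made for illustration.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score (harmonic mean of precision and recall)
    for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels (assumption): 1 = pathological, 0 = normophonic.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

score = f1_score(y_true, y_pred)
print(round(score, 3))  # 4 TP, 1 FP, 1 FN -> precision = recall = 0.8
```

Because F1 ignores true negatives, it depends on which class is treated as positive; the abstract does not state this choice, so the convention used here should be read as an assumption.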

Comments:	11 pages, 1 figure, 10 tables. Keywords: Voice pathology detection, deep learning, gradient boosting, anomaly detection
Subjects:	Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:1907.06129 [cs.SD]
	(or arXiv:1907.06129v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.1907.06129
Journal reference:	Neural Computing and Applications (2018): 1-11
Related DOI:	https://doi.org/10.1007/s00521-018-3464-7

Submission history

From: Pavol Harar
[v1] Sat, 13 Jul 2019 21:09:40 UTC (419 KB)
