Voice Pathology Detection Using Deep Learning: a Preliminary Study

Harar, Pavol; Alonso-Hernandez, Jesus B.; Mekyska, Jiri; Galaz, Zoltan; Burget, Radim; Smekal, Zdenek

doi:10.1109/IWOBI.2017.7985525

Electrical Engineering and Systems Science > Audio and Speech Processing

arXiv:1907.05905 (eess)

[Submitted on 12 Jul 2019]



Abstract: This paper describes a preliminary investigation of voice pathology detection using deep neural networks (DNNs). We used voice recordings of the sustained vowel /a/ produced at normal pitch from the German corpus Saarbruecken Voice Database (SVD). This corpus contains voice recordings and electroglottograph signals of more than 2 000 speakers. The idea behind this experiment is to apply convolutional layers in combination with recurrent Long Short-Term Memory (LSTM) layers to the raw audio signal. Each recording was split into 64 ms Hamming-windowed segments with 30 ms overlap. Our trained model achieved 71.36% accuracy with 65.04% sensitivity and 77.67% specificity on 206 validation files, and 68.08% accuracy with 66.75% sensitivity and 77.89% specificity on 874 testing files. This is a promising result in favor of this approach because it is comparable to a similar previously published experiment that used a different methodology. Further investigation is needed to achieve state-of-the-art results.
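The segmentation step mentioned in the abstract (64 ms Hamming-windowed segments with 30 ms overlap) can be illustrated with a minimal sketch. This is not the authors' code: the function names `hamming` and `frame_signal` are hypothetical, the 50 kHz sampling rate in the example is an assumption (the abstract does not state it), and dropping the trailing samples that do not fill a whole segment is likewise an assumption.

```python
import math


def hamming(n):
    """Return an n-point symmetric Hamming window."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(signal, sample_rate, frame_ms=64.0, overlap_ms=30.0):
    """Split a 1-D signal into overlapping, Hamming-windowed segments.

    Trailing samples that do not fill a whole segment are dropped
    (an assumption; the abstract does not say how the tail is handled).
    """
    frame_len = round(sample_rate * frame_ms / 1000.0)
    overlap_len = round(sample_rate * overlap_ms / 1000.0)
    hop = frame_len - overlap_len  # step between segment starts
    if hop <= 0:
        raise ValueError("overlap must be shorter than the segment length")
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames


# Hypothetical input: one second of a 100 Hz sine at an assumed 50 kHz rate.
sample_rate = 50_000
signal = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(sample_rate)]
frames = frame_signal(signal, sample_rate)
print(len(frames), len(frames[0]))  # 28 3200
```

Under these assumptions each segment holds 3200 samples and consecutive segments start 1700 samples (34 ms) apart, so a one-second signal yields 28 segments.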

Comments:	4 pages, 1 figure, 5 tables
Subjects:	Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Sound (cs.SD)
Cite as:	arXiv:1907.05905 [eess.AS]
	(or arXiv:1907.05905v1 [eess.AS] for this version)
	https://doi.org/10.48550/arXiv.1907.05905
Journal reference:	In 2017 international conference and workshop on bioinspired intelligence (IWOBI), pp. 1-4. IEEE, 2017
Related DOI:	https://doi.org/10.1109/IWOBI.2017.7985525

Submission history

From: Pavol Harar
[v1] Fri, 12 Jul 2019 18:06:02 UTC (1,035 KB)
