Speech Emotion Recognition using Semantic Information

Tzirakis, Panagiotis; Nguyen, Anh; Zafeiriou, Stefanos; Schuller, Björn W.

arXiv:2103.02993 (cs)

[Submitted on 4 Mar 2021]

Abstract: Speech emotion recognition is a crucial problem that arises in a multitude of applications, such as human-computer interaction and education. Although several advances have been made in recent years, especially with the advent of Deep Neural Networks (DNNs), most studies in the literature fail to consider the semantic information in the speech signal. In this paper, we propose a novel framework that can capture both the semantic and the paralinguistic information in the signal. In particular, our framework comprises a semantic feature extractor, which captures the semantic information, and a paralinguistic feature extractor, which captures the paralinguistic information. The semantic and paralinguistic features are then combined into a unified representation using a novel attention mechanism. The unified feature vector is passed through an LSTM to capture the temporal dynamics in the signal before the final prediction. To validate the effectiveness of our framework, we use the popular SEWA dataset of the AVEC challenge series and compare against the three winning papers. Our model provides state-of-the-art results in the valence and liking dimensions.
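
The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the authors' attention mechanism, so the code below swaps in a generic softmax attention over the two feature streams; it is not the paper's method. The function names (`fuse`, `softmax`, `dot`), the scoring vector `query`, and all numeric values are hypothetical and chosen only for illustration. The LSTM stage is omitted; in a full model the fused sequence would be fed to a recurrent layer.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a short list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fuse(semantic, paralinguistic, query):
    """Combine two same-sized feature vectors with softmax attention weights.

    `query` is a hypothetical scoring vector (it would be learned in practice).
    Returns the fused vector and the two attention weights.
    """
    weights = softmax([dot(query, semantic), dot(query, paralinguistic)])
    fused = [weights[0] * s + weights[1] * p
             for s, p in zip(semantic, paralinguistic)]
    return fused, weights

# Toy per-frame features (made-up numbers, not taken from the paper).
semantic_seq = [[0.2, 0.1, 0.4], [0.0, 0.3, 0.1]]
paralinguistic_seq = [[0.5, 0.2, 0.1], [0.1, 0.1, 0.6]]
query = [1.0, 0.0, -1.0]  # assumption: stands in for learned parameters

# One fused vector per time step; this sequence would go to the LSTM.
fused_seq = [fuse(s, p, query)[0]
             for s, p in zip(semantic_seq, paralinguistic_seq)]
```

Because the two weights sum to one, each fused element lies between the corresponding semantic and paralinguistic values, so the unified representation keeps the same dimensionality as either input stream.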

Comments:	ICASSP 2021
Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2103.02993 [cs.SD]
	(or arXiv:2103.02993v1 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2103.02993

Submission history

From: Panagiotis Tzirakis
[v1] Thu, 4 Mar 2021 12:34:25 UTC (76 KB)
