A multi-modal approach for identifying schizophrenia using cross-modal attention

Premananth, Gowtham; Siriwardena, Yashish M.; Resnik, Philip; Espy-Wilson, Carol

Electrical Engineering and Systems Science > Signal Processing

arXiv:2309.15136 (eess)

[Submitted on 26 Sep 2023 (v1), last revised 18 Apr 2024 (this version, v3)]

Abstract: This study examines how different modalities of human communication can be used to distinguish healthy controls from subjects with schizophrenia who exhibit strong positive symptoms. We developed a multi-modal schizophrenia classification system using audio, video, and text. Facial action units and vocal tract variables were extracted as low-level features from video and audio, respectively, and were then used to compute high-level coordination features that served as the inputs to the audio and video modalities. Context-independent text embeddings extracted from transcriptions of speech were used as the input for the text modality. The multi-modal system was developed by fusing a segment-to-session-level classifier for the video and audio modalities with a text model based on a Hierarchical Attention Network (HAN) with cross-modal attention. The proposed multi-modal system outperforms the previous state-of-the-art multi-modal system by 8.53% in weighted average F1 score.
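The abstract reports its headline result as a weighted average F1 score. For readers unfamiliar with the metric, the following is a minimal sketch of how it is commonly computed: the F1 score of each class is weighted by that class's share of the true labels. The function name `weighted_f1` and the example labels are illustrative assumptions and are not taken from the paper; the authors' actual evaluation code may differ.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Weighted-average F1 (hypothetical helper, not from the paper).

    Each class's F1 is weighted by its support, i.e. the fraction of
    true labels that belong to that class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n_label in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n_label - tp
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        f1 = 2 * tp / denom if denom else 0.0
        score += (n_label / total) * f1
    return score


# Illustrative session-level labels (made up for this example).
y_true = ["control", "control", "control", "schizophrenia", "schizophrenia"]
y_pred = ["control", "control", "schizophrenia", "schizophrenia", "schizophrenia"]
print(weighted_f1(y_true, y_pred))
```

In this toy example both classes have a per-class F1 of 0.8, so the weighted average is approximately 0.8 as well. Weighting by support keeps the score from being dominated by whichever class a classifier happens to handle best when the classes are imbalanced.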

Comments:	Accepted to Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2024
Subjects:	Signal Processing (eess.SP); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
Cite as:	arXiv:2309.15136 [eess.SP]
	(or arXiv:2309.15136v3 [eess.SP] for this version)
	https://doi.org/10.48550/arXiv.2309.15136

Submission history

From: Gowtham Premananth
[v1] Tue, 26 Sep 2023 14:28:16 UTC (388 KB)
[v2] Fri, 2 Feb 2024 19:07:40 UTC (1,084 KB)
[v3] Thu, 18 Apr 2024 21:33:28 UTC (1,084 KB)
