JMI at SemEval 2024 Task 3: Two-step approach for multimodal ECAC using in-context learning with GPT and instruction-tuned Llama models

Arefa; Ansari, Mohammed Abbas; Saxena, Chandni; Ahmad, Tanvir

arXiv:2403.04798 (cs)

[Submitted on 5 Mar 2024 (v1), last revised 2 Apr 2024 (this version, v2)]

Abstract: This paper presents our system for SemEval-2024 Task 3: "The Competition of Multimodal Emotion Cause Analysis in Conversations". Effectively capturing emotions in human conversations requires integrating multiple modalities such as text, audio, and video. However, the complexity of these diverse modalities makes it challenging to develop an efficient multimodal emotion cause analysis (ECA) system. We address these challenges with a two-step framework, which we implement in two different ways. In Approach 1, we instruction-tune two separate Llama 2 models, one for emotion prediction and one for cause prediction. In Approach 2, we use GPT-4V to generate conversation-level video descriptions and apply in-context learning with annotated conversations using GPT-3.5. Our system ranked 4th, and ablation experiments demonstrate that our proposed solutions achieve significant performance gains. All experimental code is available on GitHub.
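To make the two-step idea concrete, the following is a minimal sketch of the control flow only, not the authors' implementation. It assumes that step 1 assigns an emotion label to each utterance and step 2 finds cause utterances for each non-neutral one. The names `two_step_ecac`, `toy_emotion`, and `toy_causes`, the `"neutral"` label, and the keyword rules are all hypothetical stand-ins; in the system described above, the two callables would be backed by the instruction-tuned Llama 2 models or by GPT-3.5 prompts.

```python
from typing import Callable, List, Tuple

Conversation = List[str]
# Hypothetical interfaces: any model can be plugged in behind these callables.
EmotionFn = Callable[[Conversation, int], str]
CauseFn = Callable[[Conversation, int, str], List[int]]


def two_step_ecac(
    conv: Conversation,
    predict_emotion: EmotionFn,
    predict_causes: CauseFn,
    neutral: str = "neutral",  # assumed name of the "no emotion" label
) -> List[Tuple[int, str, int]]:
    """Return (emotion_utterance_idx, emotion, cause_utterance_idx) triples."""
    triples = []
    for i in range(len(conv)):
        # Step 1: label the emotion of utterance i.
        emotion = predict_emotion(conv, i)
        if emotion == neutral:
            continue  # only emotional utterances get a cause search
        # Step 2: find the utterances that caused this emotion.
        for j in predict_causes(conv, i, emotion):
            triples.append((i, emotion, j))
    return triples


# Toy stand-ins so the sketch runs without any model.
def toy_emotion(conv: Conversation, i: int) -> str:
    return "joy" if "!" in conv[i] else "neutral"


def toy_causes(conv: Conversation, i: int, emotion: str) -> List[int]:
    # Pretend the previous utterance (or the first one itself) is the cause.
    return [max(i - 1, 0)]


conv = ["I got the job!", "That is great news!", "Thanks."]
result = two_step_ecac(conv, toy_emotion, toy_causes)
print(result)
```

Keeping the two predictors behind plain callables is one way to swap a fine-tuned model for a prompted one without changing the pipeline; whether the actual system is organized this way is not stated in the abstract.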

Comments:	Paper Accepted at SemEval 2024
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2403.04798 [cs.CL]
	(or arXiv:2403.04798v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2403.04798

Submission history

From: Chandni Saxena
[v1] Tue, 5 Mar 2024 12:07:18 UTC (4,062 KB)
[v2] Tue, 2 Apr 2024 14:52:37 UTC (6,784 KB)
