Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic

Alwajih, Fakhraddin; Bhatia, Gagan; Abdul-Mageed, Muhammad

arXiv:2407.18129 (cs)

[Submitted on 25 Jul 2024 (v1), last revised 26 Jul 2024 (this version, v2)]

Abstract: Recent advancements have significantly enhanced the capabilities of Multimodal Large Language Models (MLLMs) in generating and understanding image-to-text content. Despite these successes, progress is predominantly limited to English due to the scarcity of high-quality multimodal resources in other languages. This limitation impedes the development of competitive models in languages such as Arabic. To alleviate this situation, we introduce an efficient Arabic multimodal assistant, dubbed Dallah, that utilizes an advanced language model based on LLaMA-2 to facilitate multimodal interactions. Dallah demonstrates state-of-the-art performance among Arabic MLLMs. Through fine-tuning on six Arabic dialects, Dallah showcases its capability to handle complex dialectal interactions incorporating both textual and visual elements. The model excels in two benchmark tests: one evaluating its performance on Modern Standard Arabic (MSA) and another specifically designed to assess dialectal responses. Beyond its robust performance in multimodal interaction tasks, Dallah has the potential to pave the way for further development of dialect-aware Arabic MLLMs.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2407.18129 [cs.CL]
	(or arXiv:2407.18129v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2407.18129

Submission history

From: Gagan Bhatia
[v1] Thu, 25 Jul 2024 15:36:48 UTC (26,497 KB)
[v2] Fri, 26 Jul 2024 15:34:12 UTC (26,497 KB)
