CAMEL-Bench: A Comprehensive Arabic LMM Benchmark

Ghaboura, Sara; Heakl, Ahmed; Thawakar, Omkar; Alharthi, Ali; Riahi, Ines; Saif, Abduljalil; Laaksonen, Jorma; Khan, Fahad S.; Khan, Salman; Anwer, Rao M.

arXiv:2410.18976 (cs)

[Submitted on 24 Oct 2024]

Abstract: Recent years have seen significant interest in developing large multimodal models (LMMs) capable of performing various visual reasoning and understanding tasks. This has led to the introduction of multiple benchmarks that evaluate LMMs on different tasks. However, most existing LMM evaluation benchmarks are English-centric. In this work, we develop a comprehensive LMM evaluation benchmark for the Arabic language, which is spoken by a large population of over 400 million people. The proposed benchmark, named CAMEL-Bench, comprises eight diverse domains and 38 sub-domains, including multi-image understanding, complex visual perception, handwritten document understanding, video understanding, medical imaging, plant diseases, and remote sensing-based land use understanding, to evaluate broad scenario generalizability. CAMEL-Bench comprises around 29,036 questions filtered from a larger pool of samples, with quality manually verified by native speakers to ensure reliable model assessment. We evaluate both closed-source LMMs, including the GPT-4 series, and open-source LMMs. Our analysis reveals the need for substantial improvement, especially among the best open-source models, with even the closed-source GPT-4o achieving an overall score of 62%. Our benchmark and evaluation scripts are open-sourced.

Comments:	10 pages, 5 figures, NAACL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computers and Society (cs.CY); Machine Learning (cs.LG)
Cite as:	arXiv:2410.18976 [cs.CV]
	(or arXiv:2410.18976v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2410.18976

Submission history

From: Ahmed Heakl
[v1] Thu, 24 Oct 2024 17:59:38 UTC (22,551 KB)
