CIEM: Contrastive Instruction Evaluation Method for Better Instruction Tuning

Hu, Hongyu; Zhang, Jiyuan; Zhao, Minyi; Sun, Zhenbang

arXiv:2309.02301 (cs)

[Submitted on 5 Sep 2023 (v1), last revised 24 Nov 2023 (this version, v2)]

Abstract: Research on Large Vision-Language Models (LVLMs) has advanced significantly thanks to the success of Large Language Models (LLMs). Nevertheless, these Vision-Language Models (VLMs) suffer from hallucination: due to insufficient understanding of the vision and language modalities, VLMs may generate incorrect perceptual information in downstream applications, for example by captioning a non-existent entity. To address hallucination, we first introduce the Contrastive Instruction Evaluation Method (CIEM), an automatic pipeline that leverages an annotated image-text dataset coupled with an LLM to generate factual/contrastive question-answer pairs for evaluating hallucination in VLMs. Second, building on CIEM, we propose a new instruction tuning method called Contrastive Instruction Tuning (CIT), which alleviates hallucination in VLMs by automatically producing high-quality factual/contrastive question-answer pairs and corresponding justifications for model tuning. Through extensive experiments on CIEM and CIT, we pinpoint the hallucination issues commonly present in existing VLMs, the inability of current instruction-tuning datasets to handle hallucination, and the superiority of CIT-tuned VLMs on both CIEM and public datasets.
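
As a rough illustration of what factual/contrastive question-answer pairs could look like, the sketch below builds yes/no questions from object annotations and scores a model on them. It is not the authors' pipeline: the abstract says an LLM generates the pairs, whereas this sketch substitutes a fixed question template, and every name in it (QAPair, build_qa_pairs, accuracy, always_yes, the example object lists) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class QAPair:
    image_id: str
    question: str
    answer: str  # "yes" for factual pairs, "no" for contrastive pairs
    kind: str    # "factual" or "contrastive"


def build_qa_pairs(image_id: str,
                   present: Iterable[str],
                   vocabulary: Iterable[str]) -> List[QAPair]:
    """Template-based stand-in (assumption) for the LLM generation step.

    Objects annotated as present yield factual questions (answer "yes");
    vocabulary objects that are not annotated yield contrastive questions
    (answer "no").
    """
    present_set = set(present)
    absent_set = set(vocabulary) - present_set
    pairs = [
        QAPair(image_id, f"Is there a {obj} in the image?", "yes", "factual")
        for obj in sorted(present_set)
    ]
    pairs += [
        QAPair(image_id, f"Is there a {obj} in the image?", "no", "contrastive")
        for obj in sorted(absent_set)
    ]
    return pairs


def accuracy(pairs: Iterable[QAPair],
             answer_fn: Callable[[str, str], str]) -> float:
    """Fraction of pairs for which answer_fn returns the expected answer."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = sum(
        answer_fn(p.image_id, p.question).strip().lower() == p.answer
        for p in pairs
    )
    return correct / len(pairs)


def always_yes(image_id: str, question: str) -> str:
    """Hypothetical model that affirms every question, i.e. always hallucinates."""
    return "yes"


pairs = build_qa_pairs("img-0", ["dog", "ball"], ["dog", "ball", "cat", "car"])
contrastive = [p for p in pairs if p.kind == "contrastive"]
print(accuracy(pairs, always_yes))        # overall accuracy
print(accuracy(contrastive, always_yes))  # accuracy on contrastive pairs only
```

Scoring the contrastive pairs separately is the point of the design: a model that affirms everything can look acceptable on factual questions while failing every question about an absent object.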

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2309.02301 [cs.CV]
	(or arXiv:2309.02301v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2309.02301

Submission history

From: Hongyu Hu
[v1] Tue, 5 Sep 2023 15:06:37 UTC (2,168 KB)
[v2] Fri, 24 Nov 2023 07:07:03 UTC (2,915 KB)
