MaskCD: Mitigating LVLM Hallucinations by Image Head Masked Contrastive Decoding

Deng, Jingyuan; Yang, Yujiu

arXiv:2510.02790 (cs)

[Submitted on 3 Oct 2025]

Abstract: Large vision-language models (LVLMs) have shown remarkable performance in vision-language understanding for downstream multimodal tasks. As their capabilities improve, however, new problems emerge. Among these, hallucination has attracted considerable attention: the phenomenon in which LVLMs generate content that contradicts their visual and textual inputs. Many approaches have been proposed to address this issue, such as contrastive decoding and attention manipulation. However, contrastive decoding methods struggle to construct appropriate contrastive samples, and attention manipulation methods are highly sensitive and lack stability. In this work, we propose image head Masked Contrastive Decoding (MaskCD). Our approach uses the "image heads" in LVLMs, masking them to construct contrastive samples for contrastive decoding. We evaluate MaskCD on LLaVA-1.5-7b and Qwen-VL-7b using benchmarks including CHAIR, POPE, AMBER, and MME. The results demonstrate that MaskCD effectively alleviates hallucination while retaining the general capabilities of LVLMs. Corresponding resources can be found at: this https URL.
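
The abstract does not state the decoding rule, so the following is only a minimal sketch of a generic contrastive decoding step, assuming the form common in earlier visual contrastive decoding work: the next-token log-probabilities from the unmodified model are contrasted against those from a degraded pass (here, hypothetically, a forward pass with the image heads masked). The function names, the contrast weight `alpha`, and the plausibility threshold `beta` are illustrative assumptions, not values or identifiers taken from the paper, and the selection and masking of image heads is not shown.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrastive_scores(logits_full, logits_masked, alpha=1.0, beta=0.1):
    """Combine two next-token distributions into contrastive scores.

    logits_full   -- logits from the unmodified model
    logits_masked -- logits from the degraded pass (assumed here to be the
                     same model with its image heads masked)
    alpha         -- contrast strength (assumed hyperparameter)
    beta          -- plausibility threshold (assumed hyperparameter): tokens
                     whose full-model probability is below beta times the
                     top probability are excluded
    """
    lp_full = log_softmax(logits_full)
    lp_masked = log_softmax(logits_masked)
    cutoff = max(lp_full) + math.log(beta)

    scores = []
    for f, m in zip(lp_full, lp_masked):
        if f < cutoff:
            # Implausible under the full model: never selected.
            scores.append(float("-inf"))
        else:
            # Reward tokens the full model prefers more than the masked one.
            scores.append((1 + alpha) * f - alpha * m)
    return scores


def greedy_pick(scores):
    """Index of the highest-scoring token."""
    return max(range(len(scores)), key=scores.__getitem__)


# Toy vocabulary of three tokens. Token 0 stays likely even without image
# information (a language-prior guess), while token 1 depends on the image.
logits_full = [2.0, 1.9, -3.0]
logits_masked = [2.0, 0.5, 1.0]

plain_choice = greedy_pick(logits_full)
contrastive_choice = greedy_pick(contrastive_scores(logits_full, logits_masked))
```

In this toy case, plain greedy decoding picks token 0, whereas the contrastive scores favor token 1, whose probability drops sharply once the image-dependent pass is degraded. Token 2 is removed by the plausibility cutoff, so the contrast cannot promote a token that the full model itself considers very unlikely.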

Comments:	Accepted to EMNLP 2025 Findings
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM)
Cite as:	arXiv:2510.02790 [cs.CV]
	(or arXiv:2510.02790v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2510.02790

Submission history

From: Jingyuan Deng
[v1] Fri, 3 Oct 2025 07:59:16 UTC (321 KB)
