Retrieval-enriched zero-shot image classification in low-resource domains

Dall'Asen, Nicola; Wang, Yiming; Fini, Enrico; Ricci, Elisa

arXiv:2411.00988 (cs)

[Submitted on 1 Nov 2024]

Abstract: Low-resource domains, characterized by scarce data and annotations, present significant challenges for language and visual understanding tasks, with the latter much less explored in the literature. Recent advancements in Vision-Language Models (VLMs) have shown promising results in high-resource domains but fall short on low-resource concepts that are under-represented (e.g., only a handful of images per category) in the pre-training set. We tackle the challenging task of zero-shot low-resource image classification from a novel perspective. By leveraging a retrieval-based strategy, we achieve this in a training-free fashion. Specifically, our method, named CoRE (Combination of Retrieval Enrichment), enriches the representation of both query images and class prototypes by retrieving relevant textual information from large web-crawled databases. This retrieval-based enrichment significantly boosts classification performance by incorporating the broader contextual information relevant to the specific class. We validate our method on a newly established benchmark covering diverse low-resource domains, including medical imaging, rare plants, and circuits. Our experiments demonstrate that CoRE outperforms existing state-of-the-art methods that rely on synthetic data generation and model fine-tuning.
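
To make the idea of enriching both query images and class prototypes more concrete, here is a minimal, training-free sketch. It is not the authors' implementation: the abstract does not specify how retrieved text is combined with the original embeddings, so the top-k mean, the mixing weight `alpha`, the cosine-similarity scoring, and all function names are assumptions made for illustration. The toy 3-dimensional vectors stand in for embeddings that a VLM and a web-crawled caption database would supply in practice.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def retrieve_mean(query, database, k):
    """Mean of the k database embeddings most cosine-similar to the query."""
    q = normalize(query)
    ranked = sorted(database, key=lambda d: dot(q, normalize(d)), reverse=True)[:k]
    mean = [sum(d[i] for d in ranked) / len(ranked) for i in range(len(query))]
    return normalize(mean)

def enrich(embedding, database, k, alpha):
    """Blend an embedding with its retrieved text context (assumed linear mix)."""
    e = normalize(embedding)
    r = retrieve_mean(e, database, k)
    return normalize([alpha * a + (1 - alpha) * b for a, b in zip(e, r)])

def classify(image_emb, class_embs, text_db, k=2, alpha=0.5):
    """Score the enriched image against each enriched class prototype."""
    q = enrich(image_emb, text_db, k, alpha)
    scores = {
        name: dot(q, enrich(emb, text_db, k, alpha))
        for name, emb in class_embs.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical toy data: two classes and four retrieved-caption embeddings.
class_embs = {"resistor": [1.0, 0.0, 0.1], "capacitor": [0.0, 1.0, 0.1]}
text_db = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.1, 0.9, 0.0], [0.2, 0.8, 0.1]]
image_emb = [0.7, 0.3, 0.2]

label, scores = classify(image_emb, class_embs, text_db)
print(label, scores)
```

No parameters are updated anywhere in this sketch, which is what "training-free" refers to: classification relies only on fixed embeddings and nearest-neighbour lookups.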

Comments:	Accepted to EMNLP 2024 (Main)
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2411.00988 [cs.CV]
	(or arXiv:2411.00988v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2411.00988

Submission history

From: Nicola Dall'Asen
[v1] Fri, 1 Nov 2024 19:24:55 UTC (9,910 KB)