Pro-tuning: Unified Prompt Tuning for Vision Tasks

Nie, Xing; Ni, Bolin; Chang, Jianlong; Meng, Gaomeng; Huo, Chunlei; Zhang, Zhaoxiang; Xiang, Shiming; Tian, Qi; Pan, Chunhong

Computer Science > Computer Vision and Pattern Recognition

arXiv:2207.14381 (cs)

[Submitted on 28 Jul 2022 (v1), last revised 23 Aug 2022 (this version, v3)]

Abstract: In computer vision, fine-tuning is the de facto approach for adapting pre-trained vision models to downstream tasks. However, it is challenging to deploy in practice because it updates all model parameters, which is parameter-inefficient, and relies heavily on high-quality downstream data. Recently, prompt-based learning, which adds a task-relevant prompt to adapt downstream tasks to pre-trained models, has drastically boosted performance on many natural language downstream tasks. In this work, we extend the transfer ability of prompts to vision models as an alternative to fine-tuning. To this end, we propose parameter-efficient Prompt tuning (Pro-tuning) to adapt frozen vision models to various downstream vision tasks. The key to Pro-tuning is prompt-based tuning, i.e., learning task-specific vision prompts for downstream input images while keeping the pre-trained model frozen. By training only a few additional parameters, it can work on diverse CNN-based and Transformer-based architectures. Extensive experiments show that Pro-tuning outperforms fine-tuning across a broad range of vision tasks and scenarios, including image classification (generic objects, class imbalance, image corruption, adversarial robustness, and out-of-distribution generalization) and dense prediction tasks such as object detection and semantic segmentation.
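
The idea of training only a prompt while the pre-trained model stays frozen can be illustrated with a toy sketch. This is not the Pro-tuning architecture, whose details the abstract does not give. It assumes a hypothetical frozen linear "backbone" and an additive input prompt, both invented here for illustration, and uses only the Python standard library.

```python
import random

# Hypothetical frozen "backbone": a fixed linear map, never updated below.
# (Assumption for illustration; the paper's models are CNNs and Transformers.)
frozen_weights = (0.5, -1.0, 2.0)

def forward(x, prompt):
    """Apply the frozen backbone to the prompted input x + prompt."""
    return sum(w * (xi + pi) for w, xi, pi in zip(frozen_weights, x, prompt))

def mse(data, prompt):
    """Mean squared error of the prompted model on (input, target) pairs."""
    return sum((forward(x, prompt) - y) ** 2 for x, y in data) / len(data)

def tune_prompt(data, steps=200, lr=0.05):
    """Gradient descent on the prompt only; the backbone gets no updates."""
    prompt = [0.0] * len(frozen_weights)
    for _ in range(steps):
        grad = [0.0] * len(prompt)
        for x, y in data:
            err = forward(x, prompt) - y
            for i, w in enumerate(frozen_weights):
                # d(err^2)/d(prompt_i) = 2 * err * w_i for this linear model
                grad[i] += 2.0 * err * w / len(data)
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt

# Toy downstream task: the backbone's own output shifted by a constant offset.
rng = random.Random(0)
zero_prompt = [0.0, 0.0, 0.0]
inputs = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(20)]
data = [(x, forward(x, zero_prompt) + 1.5) for x in inputs]

loss_before = mse(data, zero_prompt)
prompt = tune_prompt(data)
loss_after = mse(data, prompt)
print(f"loss before: {loss_before:.4f}, loss after: {loss_after:.2e}")
```

In this sketch the three prompt values are the only trained parameters, and the backbone weights are stored in a tuple so they cannot be modified. A real vision model would typically freeze its parameters through the training framework instead.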

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2207.14381 [cs.CV]
	(or arXiv:2207.14381v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2207.14381

Submission history

From: Xing Nie
[v1] Thu, 28 Jul 2022 21:09:31 UTC (169 KB)
[v2] Sun, 14 Aug 2022 18:16:35 UTC (169 KB)
[v3] Tue, 23 Aug 2022 03:39:05 UTC (1,919 KB)
