Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach

Yu, Yue; Zhang, Rongzhi; Xu, Ran; Zhang, Jieyu; Shen, Jiaming; Zhang, Chao

Computer Science > Computation and Language

arXiv:2209.06995 (cs)

[Submitted on 15 Sep 2022 (v1), last revised 8 May 2023 (this version, v2)]

Title:Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach

Authors:Yue Yu, Rongzhi Zhang, Ran Xu, Jieyu Zhang, Jiaming Shen, Chao Zhang

View PDF

Abstract:Large Language Models have demonstrated remarkable few-shot performance, but the performance can be sensitive to the selection of few-shot instances. We propose PATRON, a new method that uses prompt-based uncertainty estimation for data selection for pre-trained language model fine-tuning under cold-start scenarios, i.e., no initial labeled data are available. In PATRON, we design (1) a prompt-based uncertainty propagation approach to estimate the importance of data points and (2) a partition-then-rewrite (PTR) strategy to promote sample diversity when querying for annotations. Experiments on six text classification datasets show that PATRON outperforms the strongest cold-start data selection baselines by up to 6.9%. Besides, with 128 labels only, PATRON achieves 91.0% and 92.1% of the fully supervised performance based on vanilla fine-tuning and prompt-based learning respectively. Our implementation of PATRON is available at \url{this https URL}.

Comments:	ACL 2023 Main Conference. Code: this https URL
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2209.06995 [cs.CL]
	(or arXiv:2209.06995v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2209.06995
Journal reference:	ACL 2023

Submission history

From: Yue Yu [view email]
[v1] Thu, 15 Sep 2022 01:51:22 UTC (1,172 KB)
[v2] Mon, 8 May 2023 20:42:26 UTC (1,191 KB)

Computer Science > Computation and Language

Title:Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computation and Language

Title:Cold-Start Data Selection for Few-shot Language Model Fine-tuning: A Prompt-Based Uncertainty Propagation Approach

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators