Concept-aware Training Improves In-context Learning Ability of Language Models

Štefánik, Michal; Kadlčík, Marek

arXiv:2305.13775 (cs)

[Submitted on 23 May 2023]

Abstract: Many recent language models (LMs) of the Transformer family exhibit so-called in-context learning (ICL) ability, manifested in the LMs' ability to modulate their function according to a task described in natural-language input. Previous work curating these models assumes that ICL emerges from vast over-parametrization or from the scale of multi-task training. However, a complementary branch of recent theoretical work attributes the emergence of ICL to specific properties of the training data and creates functional in-context learners in small-scale, synthetic settings.
Inspired by recent findings on the data properties driving the emergence of ICL, we propose a method for creating LMs that make better use of in-context information by constructing training scenarios in which it is beneficial for the LM to capture analogical reasoning concepts. We find that the data sampling of Concept-aware Training (CoAT) consistently improves models' reasoning ability. As a result, in-context learners trained with CoAT on only two datasets of a single (QA) task perform comparably to larger models trained on 1600+ tasks.

Comments:	Work in progress
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2305.13775 [cs.CL]
	(or arXiv:2305.13775v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2305.13775

Submission history

From: Michal Štefánik
[v1] Tue, 23 May 2023 07:44:52 UTC (311 KB)
