Simple and Effective Input Reformulations for Translation

Yu, Brian; Lillemark, Hansen; Keutzer, Kurt

doi:10.18653/v1/2023.emnlp-main.638


arXiv:2311.06696 (cs)

[Submitted on 12 Nov 2023]


Abstract: Foundation language models learn from their finetuning input context in different ways. In this paper, we reformulate inputs during finetuning for challenging translation tasks, leveraging model strengths from pretraining in novel ways to improve downstream performance. These reformulations are simple data-level modifications that require no additional collection of training data and no modification of data at inference time. They can be applied to either single-language-pair translation tasks or massively multilingual translation tasks. Experiments with these techniques demonstrate significant performance improvements of up to 3.5 chrF++ on the Flores200 translation benchmark. We hope our research accessibly improves finetuning data efficiency, enabling more effective training to scalably improve state-of-the-art performance. Our code is released at this https URL.
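As a rough illustration of the workflow the abstract describes, the sketch below applies a data-level reformulation to finetuning pairs only and leaves inference inputs untouched. Everything in it is an assumption made for illustration: the `Pair` type, the function names, and in particular `tag_reformulation`, which is a hypothetical stand-in and not one of the paper's reformulations (the abstract does not specify them).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Pair:
    """One parallel training example (hypothetical container type)."""
    source: str
    target: str


def tag_reformulation(pair: Pair) -> Pair:
    """Placeholder reformulation: prefix the source with a task tag.

    This is NOT the paper's method; it only stands in for some
    data-level rewrite of a training example.
    """
    return Pair(source=f"translate: {pair.source}", target=pair.target)


def build_finetuning_set(
    pairs: List[Pair], reformulate: Callable[[Pair], Pair]
) -> List[Pair]:
    """Rewrite existing pairs one-to-one; no new data is collected."""
    return [reformulate(p) for p in pairs]


def build_inference_input(source: str) -> str:
    """Inference-time inputs are passed through unmodified."""
    return source


if __name__ == "__main__":
    train = [Pair("bonjour", "hello"), Pair("merci", "thank you")]
    for example in build_finetuning_set(train, tag_reformulation):
        print(example)
    print(build_inference_input("bonjour"))
```

Passing the reformulation as a callable keeps the training-only rewrite separate from the inference path, which mirrors the abstract's claim that no data is modified at inference time.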

Comments:	13 pages, 6 figures. To be published in Empirical Methods in Natural Language Processing (Main) 2023
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2311.06696 [cs.CL]
	(or arXiv:2311.06696v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2311.06696
Related DOI:	https://doi.org/10.18653/v1/2023.emnlp-main.638

Submission history

From: Hansen Lillemark
[v1] Sun, 12 Nov 2023 00:23:37 UTC (7,405 KB)
