GreenPLM: Cross-lingual pre-trained language models conversion with (almost) no cost

Zeng, Qingcheng; Garay, Lucas; Zhou, Peilin; Chong, Dading; Hua, Yining; Wu, Jiageng; Pan, Yikang; Zhou, Han; Yang, Jie

arXiv:2211.06993v1 (cs)

[Submitted on 13 Nov 2022 (this version), latest version 26 May 2023 (v3)]

Abstract: While large pre-trained models have transformed the field of natural language processing (NLP), the high training cost and low cross-lingual availability of such models prevent the new advances from being equally shared by users across all languages, especially the less spoken ones. To promote equal opportunities for all language speakers in NLP research and to reduce energy consumption for sustainability, this study proposes an effective and energy-efficient framework, GreenPLM, that uses bilingual lexicons to directly translate language models of one language into other languages at (almost) no additional cost. We validate this approach in 18 languages and show that the framework is comparable to, if not better than, other heuristics trained at high cost. In addition, when given a low computational cost (2.5%), the framework outperforms the original monolingual language models in six out of seven tested languages. This approach can be easily implemented, and we will soon release language models in 50 languages translated from English.
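The abstract does not spell out the conversion procedure, so the following is only a rough sketch of what lexicon-based translation of a model's vocabulary could look like, not the authors' implementation. It assumes a word-level vocabulary that maps tokens to embedding rows and a one-to-one bilingual lexicon; the function name `translate_vocab`, the toy vocabulary, and the first-match tie-breaking rule are all hypothetical choices made for illustration.

```python
from typing import Dict, List


def translate_vocab(src_vocab: Dict[str, int], lexicon: Dict[str, str]) -> Dict[str, int]:
    """Relabel each source token with its target-language translation.

    The row index is kept, so the translated token reuses the embedding
    that was learned for the source token. No training is involved.
    """
    tgt_vocab: Dict[str, int] = {}
    for token, idx in src_vocab.items():
        # Tokens without a lexicon entry (special tokens, punctuation)
        # are carried over unchanged.
        tgt_token = lexicon.get(token, token)
        # Assumption: if two source tokens share one translation,
        # the first one encountered keeps the slot.
        tgt_vocab.setdefault(tgt_token, idx)
    return tgt_vocab


# Toy English vocabulary, embedding table, and English-to-German lexicon
# (all values are made up for the example).
src_vocab = {"[CLS]": 0, "cat": 1, "dog": 2, "house": 3}
embeddings: List[List[float]] = [[0.0, 0.0], [0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
en_de = {"cat": "Katze", "dog": "Hund", "house": "Haus"}

de_vocab = translate_vocab(src_vocab, en_de)
# "Katze" now points at the embedding row originally learned for "cat".
katze_vector = embeddings[de_vocab["Katze"]]
```

In this sketch the model weights are untouched and only the token labels change, which is one way to read "(almost) no additional cost". A real system would also have to handle subword tokenization and one-to-many translations, which the sketch ignores.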

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2211.06993 [cs.CL]
	(or arXiv:2211.06993v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2211.06993

Submission history

From: Qingcheng Zeng
[v1] Sun, 13 Nov 2022 18:59:15 UTC (2,193 KB)
[v2] Tue, 29 Nov 2022 20:45:15 UTC (2,193 KB)
[v3] Fri, 26 May 2023 13:28:36 UTC (548 KB)
