Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes

Qin, Zhen; Chen, Daoyuan; Qian, Bingchen; Ding, Bolin; Li, Yaliang; Deng, Shuiguang

arXiv:2312.06353 (cs)

[Submitted on 11 Dec 2023 (v1), last revised 27 May 2024 (this version, v5)]

Abstract: Pre-trained large language models (LLMs) need fine-tuning to improve their responsiveness to natural language instructions. Federated learning offers a way to fine-tune LLMs using the abundant data on end devices without compromising data privacy. Most existing federated fine-tuning methods for LLMs rely on parameter-efficient fine-tuning techniques, which may not reach the performance attainable with full-parameter tuning. However, federated full-parameter tuning of LLMs is a non-trivial problem due to the immense communication cost. This work introduces FedKSeed, which employs zeroth-order optimization with a finite set of random seeds. It significantly reduces transmission requirements between the server and clients to just a few random seeds and scalar gradients, amounting to only a few thousand bytes, making federated full-parameter tuning of billion-sized LLMs possible on devices. Building on it, we develop a strategy enabling probability-differentiated seed sampling, prioritizing perturbations with greater impact on model accuracy. Experiments across six scenarios with various LLMs, datasets and data partitions demonstrate that our approach outperforms existing federated LLM fine-tuning methods in both communication efficiency and new task generalization.
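
The idea of exchanging only random seeds and scalar gradients can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`perturbation`, `scalar_gradient`, `client_train`, `replay`), the two-point finite-difference estimator, the Gaussian perturbations, and the step size `eps` are all assumptions chosen for illustration. The point it shows is that a perturbation direction can be regenerated from its seed, so a list of (seed, scalar) pairs is enough for another party to reproduce a full-parameter update.

```python
import numpy as np

def perturbation(seed, dim):
    # Regenerate the same random direction from the seed alone (assumed Gaussian).
    return np.random.default_rng(seed).standard_normal(dim)

def scalar_gradient(loss_fn, w, seed, eps=1e-3):
    # Two-point zeroth-order estimate of the directional derivative along z.
    z = perturbation(seed, w.size)
    return (loss_fn(w + eps * z) - loss_fn(w - eps * z)) / (2 * eps)

def client_train(loss_fn, w0, seeds, lr, eps=1e-3):
    # Local training: one step per seed, recording only (seed, scalar) pairs.
    w = w0.copy()
    history = []
    for seed in seeds:
        g = scalar_gradient(loss_fn, w, seed, eps)
        w -= lr * g * perturbation(seed, w.size)
        history.append((seed, g))
    return w, history

def replay(w0, history, lr):
    # Rebuild the updated weights from (seed, scalar) pairs, without raw data
    # and without receiving any parameter tensors.
    w = w0.copy()
    for seed, g in history:
        w -= lr * g * perturbation(seed, w.size)
    return w

if __name__ == "__main__":
    # Toy quadratic loss standing in for a model's training loss (hypothetical).
    target = np.arange(10, dtype=float)
    loss = lambda w: 0.5 * float(np.sum((w - target) ** 2))
    w0 = np.zeros(10)
    w_client, history = client_train(loss, w0, seeds=range(200), lr=0.01)
    w_server = replay(w0, history, lr=0.01)
    print("weights match:", np.allclose(w_client, w_server))
    print("loss before/after:", loss(w0), loss(w_client))
```

In this sketch the transmitted payload is `history`, whose size depends on the number of steps rather than on the number of model parameters.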

Comments:	Accepted to ICML 2024. 25 pages, 14 figures, 7 tables. Codes are available at this https URL
Subjects:	Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:2312.06353 [cs.LG]
	(or arXiv:2312.06353v5 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2312.06353

Submission history

From: Zhen Qin [view email]
[v1] Mon, 11 Dec 2023 13:03:21 UTC (844 KB)
[v2] Tue, 26 Dec 2023 03:37:35 UTC (844 KB)
[v3] Wed, 31 Jan 2024 11:49:06 UTC (1,823 KB)
[v4] Wed, 15 May 2024 14:59:38 UTC (2,075 KB)
[v5] Mon, 27 May 2024 08:31:47 UTC (2,075 KB)
