Advantageous Parameter Expansion Training Makes Better Large Language Models

Gu, Naibin; Chen, Yilong; Zhang, Zhenyu; Fu, Peng; Lin, Zheng; Wang, Shuohuan; Sun, Yu; Wu, Hua; Wang, Weiping; Wang, Haifeng

arXiv:2505.24241 (cs)

[Submitted on 30 May 2025]

Abstract: Although scaling up the number of trainable parameters in both pre-training and fine-tuning can effectively improve the performance of large language models, it also leads to increased computational overhead. When delving into the parameter difference, we find that a subset of parameters, termed advantageous parameters, plays a crucial role in determining model performance. Further analysis reveals that stronger models tend to possess more such parameters. In this paper, we propose Advantageous Parameter EXpansion Training (APEX), a method that progressively expands advantageous parameters into the space of disadvantageous ones, thereby increasing their proportion and enhancing training effectiveness. Further theoretical analysis from the perspective of matrix effective rank explains the performance gains of APEX. Extensive experiments on both instruction tuning and continued pre-training demonstrate that, in instruction tuning, APEX outperforms full-parameter tuning while using only 52% of the trainable parameters. In continued pre-training, APEX achieves the same perplexity level as conventional training with just 33% of the training data, and yields significant improvements on downstream tasks.
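
The abstract refers to matrix effective rank without defining it. As a rough illustration only, the sketch below computes one common entropy-based notion of effective rank from a matrix's singular values: the exponential of the Shannon entropy of the normalized singular values. Whether the paper uses this exact definition is an assumption, and the function name `effective_rank` and the sample inputs are hypothetical.

```python
import math


def effective_rank(singular_values):
    """Entropy-based effective rank of a matrix, given its singular values.

    Assumption: this is the exp-of-entropy definition; the paper's own
    definition is not stated in the abstract and may differ.
    """
    # Zero singular values contribute nothing to the entropy (0 * log 0 := 0).
    positive = [s for s in singular_values if s > 0]
    if not positive:
        return 0.0
    total = sum(positive)
    # Shannon entropy of the normalized singular value distribution.
    entropy = -sum((s / total) * math.log(s / total) for s in positive)
    return math.exp(entropy)


# Hypothetical singular value spectra, chosen only for illustration.
flat = effective_rank([1.0, 1.0, 1.0, 1.0])    # energy spread evenly
skewed = effective_rank([1.0, 0.0, 0.0, 0.0])  # energy in one direction
```

Under this definition, four equal singular values give an effective rank of 4 (up to floating-point error), while a spectrum with a single nonzero singular value gives 1, so a higher value indicates that more directions of the matrix carry comparable weight.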

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2505.24241 [cs.CL]
	(or arXiv:2505.24241v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2505.24241

Submission history

From: Naibin Gu
[v1] Fri, 30 May 2025 06:06:23 UTC (405 KB)
