Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation

Yaras, Can; Wang, Peng; Balzano, Laura; Qu, Qing

arXiv:2406.04112 (cs)

[Submitted on 6 Jun 2024 (v1), last revised 10 Jun 2024 (this version, v2)]

Abstract: While overparameterization in machine learning models offers great benefits in terms of optimization and generalization, it also leads to increased computational requirements as model sizes grow. In this work, we show that by leveraging the inherent low-dimensional structures of data and compressible dynamics within the model parameters, we can reap the benefits of overparameterization without the computational burdens. In practice, we demonstrate the effectiveness of this approach for deep low-rank matrix completion as well as fine-tuning language models. Our approach is grounded in theoretical findings for deep overparameterized low-rank matrix recovery, where we show that the learning dynamics of each weight matrix are confined to an invariant low-dimensional subspace. Consequently, we can construct and train compact, highly compressed factorizations possessing the same benefits as their overparameterized counterparts. In the context of deep matrix completion, our technique substantially improves training efficiency while retaining the advantages of overparameterization. For language model fine-tuning, we propose a method called "Deep LoRA", which improves the existing low-rank adaptation (LoRA) technique, leading to reduced overfitting and a simplified hyperparameter setup, while maintaining comparable efficiency. We validate the effectiveness of Deep LoRA on natural language tasks, particularly when fine-tuning with limited data. Our code is available at this https URL.
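
To make the idea of a "compact, highly compressed factorization" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a three-factor product with inner width r and small Gaussian initialization; the abstract specifies neither the depth nor the initialization, and the names `compressed_update` and `count_params` are hypothetical. It compares the parameter count of the compressed product against a full-width three-factor product of the same outer size.

```python
import random

def rand_matrix(rows, cols, scale, rng):
    """Return a rows x cols list-of-lists with small Gaussian entries."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

def matmul(a, b):
    """Plain-Python matrix product of a (m x k) and b (k x n)."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def compressed_update(d, r, scale=1e-3, seed=0):
    """Build delta = A @ B @ C with shapes (d x r), (r x r), (r x d).

    Assumption for illustration only: three factors and Gaussian
    initialization. The abstract does not state either choice.
    """
    rng = random.Random(seed)
    a = rand_matrix(d, r, scale, rng)
    b = rand_matrix(r, r, scale, rng)
    c = rand_matrix(r, d, scale, rng)
    return matmul(matmul(a, b), c), (a, b, c)

def count_params(factors):
    """Total number of scalar entries across all factor matrices."""
    return sum(len(f) * len(f[0]) for f in factors)

d, r = 32, 4
delta, factors = compressed_update(d, r)

n_compressed = count_params(factors)  # 2*d*r + r*r
n_full_width = 3 * d * d              # three d x d factors, for comparison

print(f"update shape: {len(delta)} x {len(delta[0])}")
print(f"compressed parameters: {n_compressed}")
print(f"full-width parameters: {n_full_width}")
```

The product still has the full d x d shape, so it can stand in for a dense weight update, while the number of trainable entries grows linearly in d rather than quadratically.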

Comments:	Accepted at ICML'24 (Oral)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Signal Processing (eess.SP); Machine Learning (stat.ML)
Cite as:	arXiv:2406.04112 [cs.LG]
	(or arXiv:2406.04112v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2406.04112

Submission history

From: Can Yaras
[v1] Thu, 6 Jun 2024 14:29:49 UTC (10,775 KB)
[v2] Mon, 10 Jun 2024 02:05:26 UTC (10,830 KB)
