Low-Rank Adapters Meet Neural Architecture Search for LLM Compression

Muñoz, J. Pablo; Yuan, Jinjie; Jain, Nilesh

arXiv:2501.16372 (cs)

[Submitted on 23 Jan 2025]

Abstract: The rapid expansion of Large Language Models (LLMs) has posed significant challenges regarding the computational resources required for fine-tuning and deployment. Recent advancements in low-rank adapters have demonstrated their efficacy in parameter-efficient fine-tuning (PEFT) of these models. This retrospective paper comprehensively discusses innovative approaches that synergize low-rank representations with Neural Architecture Search (NAS) techniques, particularly weight-sharing super-networks. Robust solutions for compressing and fine-tuning large pre-trained models are developed by integrating these methodologies. Our analysis highlights the potential of these combined strategies to democratize the use of LLMs, making them more accessible for deployment in resource-constrained environments. The resulting models exhibit reduced memory footprints and faster inference times, paving the way for more practical and scalable applications of LLMs. Models and code are available at this https URL.

Comments:	AAAI-25 Workshop on Connecting Low-rank Representations in AI
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2501.16372 [cs.LG]
	(or arXiv:2501.16372v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2501.16372

Submission history

From: Juan Pablo Muñoz
[v1] Thu, 23 Jan 2025 02:14:08 UTC (541 KB)
