FlexModel: A Framework for Interpretability of Distributed Large Language Models

Choi, Matthew; Asif, Muhammad Adil; Willes, John; Emerson, David

arXiv:2312.03140 (cs)

[Submitted on 5 Dec 2023]

Abstract: With the growth of large language models, now incorporating billions of parameters, the hardware prerequisites for their training and deployment have seen a corresponding increase. Although existing tools facilitate model parallelization and distributed training, deeper model interactions, crucial for interpretability and responsible AI techniques, still demand thorough knowledge of distributed computing. This often hinders contributions from researchers with machine learning expertise but limited distributed computing background. Addressing this challenge, we present FlexModel, a software package providing a streamlined interface for engaging with models distributed across multi-GPU and multi-node configurations. The library is compatible with existing model distribution libraries and encapsulates PyTorch models. It exposes user-registerable HookFunctions to facilitate straightforward interaction with distributed model internals, bridging the gap between distributed and single-device model paradigms. Primarily, FlexModel enhances accessibility by democratizing model interactions and promotes more inclusive research in the domain of large-scale neural networks. The package is found at this https URL.
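
The hook idea described in the abstract can be illustrated with a toy sketch. This is not FlexModel's API: every name below (`ToyShardedLayer`, `register_hook`, `gather`, `scatter`) is hypothetical, and plain Python lists stand in for tensor shards living on separate devices. It only shows the general pattern of gathering sharded activations, handing the full activation to a user function as if the model ran on one device, and re-sharding the result.

```python
from typing import Callable, Dict, List

Shard = List[float]
Hook = Callable[[List[float]], List[float]]


def gather(shards: List[Shard]) -> List[float]:
    """Concatenate per-device shards into one full activation vector."""
    full: List[float] = []
    for shard in shards:
        full.extend(shard)
    return full


def scatter(full: List[float], num_devices: int) -> List[Shard]:
    """Split a full activation evenly (assumes len(full) divides evenly)."""
    size = len(full) // num_devices
    return [full[i * size:(i + 1) * size] for i in range(num_devices)]


class ToyShardedLayer:
    """Hypothetical layer whose output is split across `num_devices`."""

    def __init__(self, num_devices: int) -> None:
        self.num_devices = num_devices
        self.hooks: List[Hook] = []

    def register_hook(self, fn: Hook) -> None:
        self.hooks.append(fn)

    def forward(self, shards: List[Shard]) -> List[Shard]:
        # Each simulated device doubles its own shard independently.
        out = [[2.0 * x for x in shard] for shard in shards]
        if self.hooks:
            # Hooks see the gathered, single-device view of the activation.
            full = gather(out)
            for fn in self.hooks:
                full = fn(full)
            out = scatter(full, self.num_devices)
        return out


captured: Dict[str, List[float]] = {}


def save_and_zero_first(activation: List[float]) -> List[float]:
    """Record the full activation, then return an edited copy."""
    captured["layer"] = list(activation)
    edited = list(activation)
    edited[0] = 0.0
    return edited


layer = ToyShardedLayer(num_devices=2)
layer.register_hook(save_and_zero_first)
result = layer.forward([[1.0, 2.0], [3.0, 4.0]])
```

The user-written function never touches shard boundaries; under these assumptions, the gather and scatter steps are the only places where the distributed layout is visible.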

Comments:	14 pages, 8 figures. To appear at the Socially Responsible Language Modelling Research (SoLaR) Workshop, 37th Conference on Neural Information Processing Systems (NeurIPS 2023)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:2312.03140 [cs.LG]
	(or arXiv:2312.03140v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2312.03140

Submission history

From: Muhammad Adil Asif
[v1] Tue, 5 Dec 2023 21:19:33 UTC (906 KB)
