Adacc: An Adaptive Framework Unifying Compression and Activation Recomputation for LLM Training

Chen, Ping; Deng, Zhuohong; Li, Ping; He, Shuibing; Zhu, Hongzi; Zheng, Yi; Wang, Zhefeng; Huai, Baoxing; Guo, Minyi

arXiv:2508.00806 (cs)

[Submitted on 1 Aug 2025 (v1), last revised 8 Aug 2025 (this version, v2)]

Abstract: Training large language models (LLMs) is often constrained by GPU memory limitations. To alleviate memory pressure, activation recomputation and data compression have been proposed as two major strategies. However, both approaches have limitations: recomputation introduces significant training overhead, while compression can lead to accuracy degradation and computational inefficiency when applied naively. In this paper, we propose Adacc, the first adaptive memory optimization framework that unifies activation recomputation and data compression to improve training efficiency for LLMs while preserving model accuracy. Unlike existing methods that apply static, rule-based strategies or rely solely on one technique, Adacc makes fine-grained, tensor-level decisions, dynamically selecting between recomputation, retention, and compression based on tensor characteristics and runtime hardware constraints.
Adacc tackles three key challenges: (1) it introduces layer-specific compression algorithms that mitigate accuracy loss by accounting for outliers in LLM activations; (2) it employs a MILP-based scheduling policy to globally optimize memory strategies across layers; and (3) it integrates an adaptive policy evolution mechanism to update strategies during training in response to changing data distributions. Experimental results show that Adacc improves training throughput by 1.01x to 1.37x compared to state-of-the-art frameworks, while maintaining accuracy comparable to the baseline.
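
The tensor-level choice between retention, compression, and recomputation described in the abstract can be illustrated with a toy example. The sketch below is not the paper's MILP formulation: it stands in for a solver with an exhaustive search over a handful of tensors, and the function name `choose_strategies`, the cost table, and all numbers are hypothetical values chosen only for illustration. It picks one strategy per tensor so that total time overhead is minimized while the memory kept resident stays within a budget.

```python
from itertools import product

# Hypothetical strategy names; each tensor maps a strategy to
# (memory kept resident, time overhead). Units are arbitrary.
STRATEGIES = ("retain", "compress", "recompute")


def choose_strategies(tensors, memory_budget):
    """Pick one strategy per tensor, minimizing total overhead.

    Exhaustive search over all assignments (a stand-in for a MILP solver,
    only practical for a few tensors). Returns (plan, overhead), or None
    if no assignment fits within memory_budget.
    """
    best = None
    for plan in product(STRATEGIES, repeat=len(tensors)):
        memory = sum(t[s][0] for t, s in zip(tensors, plan))
        if memory > memory_budget:
            continue  # violates the memory constraint
        overhead = sum(t[s][1] for t, s in zip(tensors, plan))
        if best is None or overhead < best[1]:
            best = (plan, overhead)
    return best


# Made-up costs for three activation tensors.
tensors = [
    {"retain": (100, 0.0), "compress": (30, 2.0), "recompute": (0, 5.0)},
    {"retain": (200, 0.0), "compress": (60, 3.0), "recompute": (0, 4.0)},
    {"retain": (50, 0.0), "compress": (15, 1.0), "recompute": (0, 6.0)},
]

plan, overhead = choose_strategies(tensors, memory_budget=150)
print(plan, overhead)
```

With these made-up costs the search keeps the first and third tensors and recomputes the second, since recomputing the largest tensor frees enough memory to retain the other two at no extra cost. A real scheduler would need measured per-tensor costs and a solver that scales beyond a few tensors.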

Comments:	8 pages
Subjects:	Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as:	arXiv:2508.00806 [cs.LG]
	(or arXiv:2508.00806v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2508.00806

Submission history

From: Ping Chen
[v1] Fri, 1 Aug 2025 17:39:25 UTC (3,252 KB)
[v2] Fri, 8 Aug 2025 09:49:52 UTC (1,118 KB)
