The Importance of Being Lazy: Scaling Limits of Continual Learning

Graldi, Jacopo; Breccia, Alessandro; Lanzillotta, Giulia; Hofmann, Thomas; Noci, Lorenzo

arXiv:2506.16884 (cs)

[Submitted on 20 Jun 2025]

Abstract: Despite recent efforts, neural networks still struggle to learn in non-stationary environments, and our understanding of catastrophic forgetting (CF) is far from complete. In this work, we perform a systematic study on the impact of model scale and the degree of feature learning in continual learning. We reconcile existing contradictory observations on scale in the literature, by differentiating between lazy and rich training regimes through a variable parameterization of the architecture. We show that increasing model width is only beneficial when it reduces the amount of feature learning, yielding more laziness. Using the framework of dynamical mean field theory, we then study the infinite width dynamics of the model in the feature learning regime and characterize CF, extending prior theoretical results limited to the lazy regime. We study the intricate relationship between feature learning, task non-stationarity, and forgetting, finding that high feature learning is only beneficial with highly similar tasks. We identify a transition modulated by task similarity where the model exits an effectively lazy regime with low forgetting to enter a rich regime with significant forgetting. Finally, our findings reveal that neural networks achieve optimal performance at a critical level of feature learning, which depends on task non-stationarity and transfers across model scales. This work provides a unified perspective on the role of scale and feature learning in continual learning.

Comments:	Proceedings of the 42nd International Conference on Machine Learning (2025). JG and AB contributed equally to this work
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:2506.16884 [cs.LG]
	(or arXiv:2506.16884v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2506.16884

Submission history

From: Jacopo Graldi
[v1] Fri, 20 Jun 2025 10:12:38 UTC (6,039 KB)
