Bilevel Optimization with a Lower-level Contraction: Optimal Sample Complexity without Warm-start

Grazzi, Riccardo; Pontil, Massimiliano; Salzo, Saverio


arXiv:2202.03397 (stat)

[Submitted on 7 Feb 2022 (v1), last revised 16 Nov 2023 (this version, v4)]



Abstract: We analyse a general class of bilevel problems, in which the upper-level problem consists in the minimization of a smooth objective function and the lower-level problem is to find the fixed point of a smooth contraction map. This class of problems includes instances of meta-learning, equilibrium models, hyperparameter optimization and data poisoning adversarial attacks. Several recent works have proposed algorithms which warm-start the lower-level problem, i.e., they use the previous lower-level approximate solution as a starting point for the lower-level solver. This warm-start procedure allows one to improve the sample complexity in both the stochastic and deterministic settings, achieving in some cases the order-wise optimal sample complexity. However, there are situations, e.g., meta-learning and equilibrium models, in which the warm-start procedure is not well-suited or is ineffective. In this work we show that without warm-start, it is still possible to achieve order-wise (near) optimal sample complexity. In particular, we propose a simple method that uses (stochastic) fixed point iterations at the lower-level and projected inexact gradient descent at the upper-level, and that reaches an $\epsilon$-stationary point using $O(\epsilon^{-2})$ and $\tilde{O}(\epsilon^{-1})$ samples for the stochastic and the deterministic setting, respectively. Finally, compared to methods using warm-start, our approach yields a simpler analysis that does not need to study the coupled interactions between the upper-level and lower-level iterates.
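
The scheme described in the abstract (fixed-point iterations at the lower level, projected inexact gradient descent at the upper level, no warm-start) can be illustrated on a toy problem. The following is a minimal deterministic sketch under stated assumptions, not the authors' implementation: the scalar contraction Phi(w, lam) = a*w + b*lam with |a| < 1, the objective E(w) = 0.5*(w - target)**2, the interval constraint, and all function and parameter names are hypothetical choices made for illustration.

```python
def fixed_point(step, x0, n_iters):
    """Apply x <- step(x) n_iters times, starting from x0."""
    x = x0
    for _ in range(n_iters):
        x = step(x)
    return x


def approx_hypergradient(lam, a, b, target, t, k):
    # Lower level: t fixed-point iterations of Phi(w, lam) = a*w + b*lam,
    # always started from 0.0 (no warm-start from a previous outer step).
    w = fixed_point(lambda w: a * w + b * lam, 0.0, t)
    grad_w = w - target  # dE/dw for the assumed E(w) = 0.5 * (w - target)**2
    # Adjoint system v = a*v + grad_w, again solved by k fixed-point
    # iterations from 0.0; its exact solution is grad_w / (1 - a).
    v = fixed_point(lambda v: a * v + grad_w, 0.0, k)
    # dE/dlam is zero for this toy objective, so only the implicit term remains.
    return b * v


def project(lam, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return min(max(lam, lo), hi)


def solve_bilevel(a, b, target, lam0, step_size, n_outer, t, k, lo, hi):
    """Projected gradient descent using the inexact hypergradient above."""
    lam = lam0
    for _ in range(n_outer):
        g = approx_hypergradient(lam, a, b, target, t, k)
        lam = project(lam - step_size * g, lo, hi)
    return lam


lam_hat = solve_bilevel(a=0.5, b=1.0, target=1.0, lam0=0.0,
                        step_size=0.2, n_outer=50, t=30, k=30,
                        lo=-1.0, hi=1.0)
print(lam_hat)
```

In this toy instance the lower-level fixed point is w(lam) = 2*lam, so the upper-level objective is 0.5*(2*lam - 1)**2 and the printed value should be close to its minimizer 0.5. In the vector case the adjoint system would involve the transposed Jacobian of the contraction map, and in the stochastic setting the exact evaluations of the map and of the gradients would be replaced by sampled estimates; neither is shown here.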

Comments:	Corrected Remark 18 + other small edits. Code at this https URL
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:2202.03397 [stat.ML]
	(or arXiv:2202.03397v4 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2202.03397
Journal reference:	Journal of Machine Learning Research, volume 24, number 167, pages 1-37, year 2023

Submission history

From: Riccardo Grazzi
[v1] Mon, 7 Feb 2022 18:35:46 UTC (33 KB)
[v2] Mon, 12 Sep 2022 15:34:24 UTC (165 KB)
[v3] Tue, 30 May 2023 08:23:30 UTC (170 KB)
[v4] Thu, 16 Nov 2023 11:13:55 UTC (166 KB)
