Transformer-Based Learned Optimization

Gärtner, Erik; Metz, Luke; Andriluka, Mykhaylo; Freeman, C. Daniel; Sminchisescu, Cristian

arXiv:2212.01055 (cs)

[Submitted on 2 Dec 2022 (v1), last revised 28 Jun 2023 (this version, v4)]

Abstract: We propose a new approach to learned optimization where we represent the computation of an optimizer's update step using a neural network. The parameters of the optimizer are then learned by training on a set of optimization tasks with the objective of performing minimization efficiently. Our innovation is a new neural network architecture, Optimus, for the learned optimizer inspired by the classic BFGS algorithm. As in BFGS, we estimate a preconditioning matrix as a sum of rank-one updates but use a Transformer-based neural network to predict these updates jointly with the step length and direction. In contrast to several recent learned optimization-based approaches, our formulation allows for conditioning across the dimensions of the parameter space of the target problem while remaining applicable to optimization tasks of variable dimensionality without retraining. We demonstrate the advantages of our approach on a benchmark composed of objective functions traditionally used for the evaluation of optimization algorithms, as well as on the real-world task of physics-based visual reconstruction of articulated 3D human motion.
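
To make the idea of a preconditioner built as a sum of rank-one updates concrete, here is a minimal sketch in plain Python. It is not the Optimus architecture: where the paper uses a Transformer to predict each update, this sketch substitutes the classical SR1 quasi-Newton rule as a hand-written stand-in, and it fixes the step length to 1. The helper names (`matvec`, `add_rank_one`, `sr1_update`, `minimize_quadratic`) and the toy quadratic objective are assumptions made for illustration only.

```python
# Illustrative sketch only; not the Optimus architecture from the paper.
# The SR1 rule below is a hand-written stand-in for the learned predictor.

def matvec(M, v):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def add_rank_one(M, u, scale):
    """Return M + scale * u u^T without modifying M."""
    n = len(u)
    return [[M[i][j] + scale * u[i] * u[j] for j in range(n)] for i in range(n)]


def sr1_update(H, s, y, eps=1e-12):
    """Rank-one (SR1) update of the inverse-Hessian estimate H.

    s is the last parameter step, y the matching change in gradient.
    The update is skipped when its denominator is numerically zero.
    """
    Hy = matvec(H, y)
    u = [si - hyi for si, hyi in zip(s, Hy)]
    denom = dot(u, y)
    if abs(denom) < eps:
        return H
    return add_rank_one(H, u, 1.0 / denom)


def minimize_quadratic(A, x0, steps):
    """Minimize f(x) = 0.5 * x^T A x using preconditioned gradient steps."""
    n = len(x0)
    # Start from the identity preconditioner (plain gradient descent).
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    x = list(x0)
    g = matvec(A, x)  # gradient of the quadratic is A x
    for _ in range(steps):
        d = matvec(H, g)  # preconditioned direction
        x_new = [xi - di for xi, di in zip(x, d)]  # unit step length
        g_new = matvec(A, x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        H = sr1_update(H, s, y)  # accumulate one rank-one correction
        x, g = x_new, g_new
    return x, H


# Toy ill-conditioned problem (hypothetical, chosen for illustration).
A = [[1.0, 0.0], [0.0, 10.0]]
x_final, H_final = minimize_quadratic(A, [1.0, 1.0], steps=2)
```

On this two-dimensional quadratic, one rank-one correction is enough for the preconditioner to match the inverse Hessian, so the second step lands on the minimizer at the origin up to rounding. A learned optimizer of the kind described in the abstract would replace `sr1_update` with a network that predicts the update vectors, step length, and direction jointly.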

Comments: Accepted to the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (CVPR) in Vancouver, Canada
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2212.01055 [cs.CV]
(or arXiv:2212.01055v4 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2212.01055

Submission history

From: Erik Gärtner
[v1] Fri, 2 Dec 2022 09:47:08 UTC (14,267 KB)
[v2] Wed, 5 Apr 2023 16:46:59 UTC (7,467 KB)
[v3] Wed, 24 May 2023 14:40:47 UTC (7,466 KB)
[v4] Wed, 28 Jun 2023 09:23:08 UTC (7,467 KB)
