-
A sharp-interface discontinuous Galerkin method for simulation of two-phase flow of real gases based on implicit shock tracking
Authors:
Charles Naudet,
Brian Taylor,
Matthew J. Zahr
Abstract:
We present a high-order, sharp-interface method for simulation of two-phase flow of real gases using implicit shock tracking. The method is based on a phase-field formulation of two-phase, compressible, inviscid flow with a trivial mixture model. Implicit shock tracking is a high-order, optimization-based discontinuous Galerkin method that automatically aligns mesh faces with non-smooth flow features to represent them perfectly with inter-element jumps. It is used to accurately approximate shocks and rarefactions without stabilization and converge the phase-field solution to a sharp-interface one by aligning mesh faces with the material interface. Time-dependent problems are formulated as steady problems in a space-time domain where complex wave interactions (e.g., intersections and reflections) manifest as space-time triple points. The space-time formulation avoids complex re-meshing and solution transfer that would be required to track moving waves with mesh faces using the method of lines. The approach is applied to several two-phase flow Riemann problems involving gases with ideal, stiffened gas, and Becker-Kistiakowsky-Wilson (BKW) equations of state, including a spherically symmetric underwater explosion problem. In all cases, the method aligns element faces with all shocks (including secondary shocks that form at time t > 0), rarefactions, and material interfaces, and accurately resolves the flow field on coarse space-time grids.
Submitted 7 March, 2025;
originally announced March 2025.
-
On Fundamental Proof Structures in First-Order Optimization
Authors:
Baptiste Goujaud,
Aymeric Dieuleveut,
Adrien Taylor
Abstract:
First-order optimization methods have attracted a lot of attention due to their practical success in many applications, including in machine learning. Obtaining convergence guarantees and worst-case performance certificates for first-order methods has become crucial for understanding the ingredients underlying efficient methods and for developing new ones. However, obtaining, verifying, and proving such guarantees is often a tedious task. Therefore, a few approaches have been proposed to render this task more systematic, and even partially automated. In addition to helping researchers find convergence proofs, these tools provide insights into the general structures of such proofs. We aim to present those structures, showing how to build convergence guarantees for first-order optimization methods.
Submitted 3 October, 2023;
originally announced October 2023.
-
Rate-Induced Transitions in Networked Complex Adaptive Systems: Exploring Dynamics and Management Implications Across Ecological, Social, and Socioecological Systems
Authors:
Vítor V. Vasconcelos,
Flávia M. D. Marquitti,
Theresa Ong,
Lisa C. McManus,
Marcus Aguiar,
Amanda B. Campos,
Partha S. Dutta,
Kristen Jovanelly,
Victoria Junquera,
Jude Kong,
Elisabeth H. Krueger,
Simon A. Levin,
Wenying Liao,
Mingzhen Lu,
Dhruv Mittal,
Mercedes Pascual,
Flávio L. Pinheiro,
Juan Rocha,
Fernando P. Santos,
Peter Sloot,
Chenyang Su,
Benton Taylor,
Eden Tekwa,
Sjoerd Terpstra
, et al. (5 additional authors not shown)
Abstract:
Complex adaptive systems (CASs), from ecosystems to economies, are open systems and inherently dependent on external conditions. While a system can transition from one state to another based on the magnitude of change in external conditions, the rate of change -- irrespective of magnitude -- may also lead to system state changes due to a phenomenon known as a rate-induced transition (RIT). This study presents a novel framework that captures RITs in CASs through a local model and a network extension where each node contributes to the structural adaptability of others. Our findings reveal how RITs occur at a critical environmental change rate, with lower-degree nodes tipping first due to fewer connections and reduced adaptive capacity. High-degree nodes tip later as their adaptability sources (lower-degree nodes) collapse. This pattern persists across various network structures. Our study calls for an extended perspective when managing CASs, emphasizing the need to focus not only on thresholds of external conditions but also the rate at which those conditions change, particularly in the context of the collapse of surrounding systems that contribute to the focal system's resilience. Our analytical method opens a path to designing management policies that mitigate RIT impacts and enhance resilience in ecological, social, and socioecological systems. These policies could include controlling environmental change rates, fostering system adaptability, implementing adaptive management strategies, and building capacity and knowledge exchange. Our study contributes to the understanding of RIT dynamics and informs effective management strategies for complex adaptive systems in the face of rapid environmental change.
Submitted 14 September, 2023;
originally announced September 2023.
-
Counter-examples in first-order optimization: a constructive approach
Authors:
Baptiste Goujaud,
Aymeric Dieuleveut,
Adrien Taylor
Abstract:
While many approaches have been developed in recent years for obtaining worst-case complexity bounds for first-order optimization methods, there remain theoretical gaps in cases where no such bound can be found. In such cases, it is often unclear whether no such bound exists (e.g., because the algorithm might fail to systematically converge) or whether the current techniques simply do not allow finding one.
In this work, we propose an approach to automate the search for cyclic trajectories generated by first-order methods. This provides a constructive approach to show that no appropriate complexity bound exists, thereby complementing the approaches providing sufficient conditions for convergence. Using this tool, we provide ranges of parameters for which some of the famous heavy-ball, Nesterov accelerated gradient, inexact gradient descent, and three-operator splitting algorithms fail to systematically converge, and show that it nicely complements existing tools searching for Lyapunov functions.
Submitted 9 January, 2025; v1 submitted 18 March, 2023;
originally announced March 2023.
-
Automated tight Lyapunov analysis for first-order methods
Authors:
Manu Upadhyaya,
Sebastian Banert,
Adrien B. Taylor,
Pontus Giselsson
Abstract:
We present a methodology for establishing the existence of quadratic Lyapunov inequalities for a wide range of first-order methods used to solve convex optimization problems. In particular, we consider (i) classes of optimization problems of finite-sum form with (possibly strongly) convex and possibly smooth functional components, (ii) first-order methods that can be written as a linear system in state-space form in feedback interconnection with the subdifferentials of the functional components of the objective function, and (iii) quadratic Lyapunov inequalities that can be used to draw convergence conclusions. We present a necessary and sufficient condition for the existence of a quadratic Lyapunov inequality within a predefined class of Lyapunov inequalities, which amounts to solving a small-sized semidefinite program. We showcase our methodology on several first-order methods that fit the framework. Most notably, our methodology allows us to significantly extend the region of parameter choices that allow for duality gap convergence in the Chambolle-Pock method when the linear operator is the identity mapping.
Submitted 27 February, 2024; v1 submitted 13 February, 2023;
originally announced February 2023.
-
Nonlinear conjugate gradient methods: worst-case convergence rates via computer-assisted analyses
Authors:
Shuvomoy Das Gupta,
Robert M. Freund,
Xu Andy Sun,
Adrien Taylor
Abstract:
We propose a computer-assisted approach to the analysis of the worst-case convergence of nonlinear conjugate gradient methods (NCGMs). Those methods are known for their generally good empirical performance for large-scale optimization, while having relatively incomplete analyses. Using our computer-assisted approach, we establish novel complexity bounds for the Polak-Ribière-Polyak (PRP) and the Fletcher-Reeves (FR) NCGMs for smooth strongly convex minimization. In particular, we construct mathematical proofs that establish the first non-asymptotic convergence bound for FR (which is historically the first developed NCGM), and a much improved non-asymptotic convergence bound for PRP. Additionally, we provide simple adversarial examples on which these methods do not perform better than gradient descent with exact line search, leaving very little room for improvement on the same class of problems.
Submitted 18 September, 2024; v1 submitted 4 January, 2023;
originally announced January 2023.
-
Convergence of Proximal Point and Extragradient-Based Methods Beyond Monotonicity: the Case of Negative Comonotonicity
Authors:
Eduard Gorbunov,
Adrien Taylor,
Samuel Horváth,
Gauthier Gidel
Abstract:
Algorithms for min-max optimization and variational inequalities are often studied under monotonicity assumptions. Motivated by non-monotone machine learning applications, we follow the line of work [Diakonikolas et al., 2021, Lee and Kim, 2021, Pethick et al., 2022, Böhm, 2022] that aims to go beyond monotonicity by considering the weaker negative comonotonicity assumption. In particular, we provide tight complexity analyses for the Proximal Point, Extragradient, and Optimistic Gradient methods in this setup, closing some questions on their working guarantees beyond monotonicity.
Submitted 18 July, 2023; v1 submitted 25 October, 2022;
originally announced October 2022.
-
Quadratic minimization: from conjugate gradient to an adaptive Heavy-ball method with Polyak step-sizes
Authors:
Baptiste Goujaud,
Adrien Taylor,
Aymeric Dieuleveut
Abstract:
In this work, we propose an adaptive variation on the classical Heavy-ball method for convex quadratic minimization. The adaptivity crucially relies on so-called "Polyak step-sizes", which consist in using knowledge of the optimal value of the optimization problem at hand instead of problem parameters such as a few eigenvalues of the Hessian of the problem. This method happens to also be equivalent to a variation of the classical conjugate gradient method, and thereby inherits many of its attractive features, including its finite-time convergence, instance optimality, and its worst-case convergence rates.
The classical gradient method with Polyak step-sizes is known to behave very well in situations in which it can be used, and the question of whether incorporating momentum in this method is possible and can improve the method itself appeared to be open. We provide a definitive answer to this question for minimizing convex quadratic functions, an arguably necessary first step for developing such methods in more general setups.
Submitted 12 October, 2022;
originally announced October 2022.
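As a rough illustration of the "Polyak step-sizes" mentioned in the abstract above, the sketch below runs the classical gradient method with the textbook step gamma = (f(x) - f*) / ||grad f(x)||^2 on a small diagonal quadratic. It is not the paper's adaptive Heavy-ball method; the test problem, the function names (`f`, `grad`, `polyak_gradient`), and the choice f* = 0 are assumptions made purely for illustration.

```python
# Illustrative sketch only: classical gradient method with Polyak step-sizes
# on a hypothetical diagonal convex quadratic (not the paper's Heavy-ball variant).

def f(x, diag):
    """f(x) = 0.5 * sum_i diag[i] * x[i]**2, minimized at x* = 0 with f* = 0."""
    return 0.5 * sum(d * xi * xi for d, xi in zip(diag, x))

def grad(x, diag):
    """Gradient of the diagonal quadratic above."""
    return [d * xi for d, xi in zip(diag, x)]

def polyak_gradient(x0, diag, f_star=0.0, iters=50):
    """Iterate x <- x - gamma * g with gamma = (f(x) - f*) / ||g||^2.

    Returns the squared distances to the minimizer x* = 0 along the run.
    """
    x = list(x0)
    dists = [sum(xi * xi for xi in x)]
    for _ in range(iters):
        g = grad(x, diag)
        g_norm2 = sum(gi * gi for gi in g)
        if g_norm2 == 0.0:  # already at the minimizer
            break
        gamma = (f(x, diag) - f_star) / g_norm2  # uses f*, not Hessian eigenvalues
        x = [xi - gamma * gi for xi, gi in zip(x, g)]
        dists.append(sum(xi * xi for xi in x))
    return dists

dists = polyak_gradient([1.0, 1.0, 1.0], diag=[1.0, 4.0, 10.0])
```

The step-size needs only the optimal value f*, which is the point the abstract makes; by convexity, the squared distance to the minimizer cannot increase from one iterate to the next.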
-
Optimal first-order methods for convex functions with a quadratic upper bound
Authors:
Baptiste Goujaud,
Adrien Taylor,
Aymeric Dieuleveut
Abstract:
We analyze worst-case convergence guarantees of first-order optimization methods over a function class extending that of smooth and convex functions. This class contains convex functions that admit a simple quadratic upper bound. Its study is motivated by its stability under minor perturbations. We provide a thorough analysis of first-order methods, including worst-case convergence guarantees for several algorithms, and demonstrate that some of them achieve the optimal worst-case guarantee over the class. We support our analysis by numerical validation of worst-case guarantees using performance estimation problems. A few observations can be drawn from this analysis, particularly regarding the optimality (resp. adaptivity) of the heavy-ball method (resp. heavy-ball with line-search). Finally, we show how our analysis can be leveraged to obtain convergence guarantees over more complex classes of functions. Overall, this study brings insights on the choice of function classes over which standard first-order methods have working worst-case guarantees.
Submitted 30 May, 2022;
originally announced May 2022.
-
A systematic approach to Lyapunov analyses of continuous-time models in convex optimization
Authors:
Céline Moucer,
Adrien Taylor,
Francis Bach
Abstract:
First-order methods are often analyzed via their continuous-time models, where their worst-case convergence properties are usually approached via Lyapunov functions. In this work, we provide a systematic and principled approach to find and verify Lyapunov functions for classes of ordinary and stochastic differential equations. More precisely, we extend the performance estimation framework, originally proposed by Drori and Teboulle [10], to continuous-time models. We retrieve convergence results comparable to those of discrete methods using fewer assumptions and convexity inequalities, and provide new results for stochastic accelerated gradient flows.
Submitted 11 March, 2024; v1 submitted 25 May, 2022;
originally announced May 2022.
-
Fast Stochastic Composite Minimization and an Accelerated Frank-Wolfe Algorithm under Parallelization
Authors:
Benjamin Dubois-Taine,
Francis Bach,
Quentin Berthet,
Adrien Taylor
Abstract:
We consider the problem of minimizing the sum of two convex functions. One of those functions has Lipschitz-continuous gradients, and can be accessed via stochastic oracles, whereas the other is "simple". We provide a Bregman-type algorithm with accelerated convergence in function values to a ball containing the minimum. The radius of this ball depends on problem-dependent constants, including the variance of the stochastic oracle. We further show that this algorithmic setup naturally leads to a variant of Frank-Wolfe achieving acceleration under parallelization. More precisely, when minimizing a smooth convex function on a bounded domain, we show that one can achieve an $ε$ primal-dual gap (in expectation) in $\tilde{O}(1/\sqrt{ε})$ iterations, by only accessing gradients of the original function and a linear maximization oracle with $O(1/\sqrt{ε})$ computing units in parallel. We illustrate this fast convergence on synthetic numerical experiments.
Submitted 25 November, 2024; v1 submitted 25 May, 2022;
originally announced May 2022.
-
Last-Iterate Convergence of Optimistic Gradient Method for Monotone Variational Inequalities
Authors:
Eduard Gorbunov,
Adrien Taylor,
Gauthier Gidel
Abstract:
The Past Extragradient (PEG) [Popov, 1980] method, also known as the Optimistic Gradient method, has recently gained interest in the optimization community with the emergence of variational inequality formulations for machine learning. Recently, in the unconstrained case, Golowich et al. [2020] proved that an $O(1/N)$ last-iterate convergence rate in terms of the squared norm of the operator can be achieved for Lipschitz and monotone operators with a Lipschitz Jacobian. In this work, by introducing a novel analysis through potential functions, we show that (i) this $O(1/N)$ last-iterate convergence can be achieved without any assumption on the Jacobian of the operator, and (ii) it can be extended to the constrained case, which was not derived before even under Lipschitzness of the Jacobian. The proof is significantly different from the one known from Golowich et al. [2020], and its discovery was computer-aided. These results close the open question of the last-iterate convergence of PEG for monotone variational inequalities.
Submitted 31 October, 2022; v1 submitted 17 May, 2022;
originally announced May 2022.
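To make the Past Extragradient (Optimistic Gradient) iteration named in the abstract above concrete, here is a minimal sketch on a toy unconstrained problem. The bilinear operator F(x, y) = (y, -x), the step-size gamma = 0.2, the starting point, and the helper names are all assumptions chosen for illustration; the sketch does not reproduce the paper's potential-function analysis or its constrained setting.

```python
# Illustrative sketch only: Past Extragradient on a hypothetical toy problem.

def F(z):
    """Monotone, 1-Lipschitz operator of the bilinear saddle problem min_x max_y x*y."""
    x, y = z
    return (y, -x)

def past_extragradient(z0, gamma=0.2, iters=1000):
    """PEG: z_tilde_k = z_k - gamma * F(z_tilde_{k-1}); z_{k+1} = z_k - gamma * F(z_tilde_k).

    Each iteration evaluates F at one new point only, reusing the previous evaluation.
    """
    z = z0
    g_prev = F(z0)  # convention assumed here: z_tilde_{-1} = z_0
    for _ in range(iters):
        z_tilde = (z[0] - gamma * g_prev[0], z[1] - gamma * g_prev[1])
        g = F(z_tilde)  # the single new operator evaluation of this iteration
        z = (z[0] - gamma * g[0], z[1] - gamma * g[1])
        g_prev = g
    return z

def squared_operator_norm(z):
    """Squared norm of F(z), the criterion in which the last-iterate rate is stated."""
    g = F(z)
    return g[0] ** 2 + g[1] ** 2

z0 = (1.0, 1.0)
z_last = past_extragradient(z0)
initial_residual = squared_operator_norm(z0)
final_residual = squared_operator_norm(z_last)
```

On this linear example the last iterate contracts geometrically, which is faster than the general $O(1/N)$ guarantee; the example only shows the structure of the update.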
-
PEPit: computer-assisted worst-case analyses of first-order optimization methods in Python
Authors:
Baptiste Goujaud,
Céline Moucer,
François Glineur,
Julien Hendrickx,
Adrien Taylor,
Aymeric Dieuleveut
Abstract:
PEPit is a Python package aiming at simplifying the access to worst-case analyses of a large family of first-order optimization methods possibly involving gradient, projection, proximal, or linear optimization oracles, along with their approximate, or Bregman variants. In short, PEPit is a package enabling computer-assisted worst-case analyses of first-order optimization methods. The key underlying idea is to cast the problem of performing a worst-case analysis, often referred to as a performance estimation problem (PEP), as a semidefinite program (SDP) which can be solved numerically. To do that, the package users are only required to write first-order methods nearly as they would have implemented them. The package then takes care of the SDP modeling parts, and the worst-case analysis is performed numerically via a standard solver.
Submitted 17 June, 2024; v1 submitted 11 January, 2022;
originally announced January 2022.
-
A note on approximate accelerated forward-backward methods with absolute and relative errors, and possibly strongly convex objectives
Authors:
Mathieu Barré,
Adrien Taylor,
Francis Bach
Abstract:
In this short note, we provide a simple version of an accelerated forward-backward method (a.k.a. Nesterov's accelerated proximal gradient method) possibly relying on approximate proximal operators and allowing one to exploit strong convexity of the objective function. The method supports both relative and absolute errors, and its behavior is illustrated on a set of standard numerical experiments. Using the same developments, we further provide a version of the accelerated proximal hybrid extragradient method of Monteiro and Svaiter (2013) possibly exploiting strong convexity of the objective function.
Submitted 21 January, 2022; v1 submitted 29 June, 2021;
originally announced June 2021.
-
Super-Acceleration with Cyclical Step-sizes
Authors:
Baptiste Goujaud,
Damien Scieur,
Aymeric Dieuleveut,
Adrien Taylor,
Fabian Pedregosa
Abstract:
We develop a convergence-rate analysis of momentum with cyclical step-sizes. We show that under some assumption on the spectral gap of Hessians in machine learning, cyclical step-sizes are provably faster than constant step-sizes. More precisely, we develop a convergence rate analysis for quadratic objectives that provides optimal parameters and shows that cyclical learning rates can improve upon traditional lower complexity bounds. We further propose a systematic approach to design optimal first order methods for quadratic minimization with a given spectral structure. Finally, we provide a local convergence rate analysis beyond quadratic minimization for the proposed methods and illustrate our findings through benchmarks on least squares and logistic regression problems.
Submitted 9 May, 2022; v1 submitted 17 June, 2021;
originally announced June 2021.
-
A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip
Authors:
Mathieu Even,
Raphaël Berthier,
Francis Bach,
Nicolas Flammarion,
Pierre Gaillard,
Hadrien Hendrikx,
Laurent Massoulié,
Adrien Taylor
Abstract:
We introduce the continuized Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter. The two variables continuously mix following a linear ordinary differential equation and take gradient steps at random times. This continuized variant benefits from the best of the continuous and the discrete frameworks: as a continuous process, one can use differential calculus to analyze convergence and obtain analytical expressions for the parameters; and a discretization of the continuized process can be computed exactly with convergence rates similar to those of Nesterov's original acceleration. We show that the discretization has the same structure as Nesterov acceleration, but with random parameters. We provide continuized Nesterov acceleration under deterministic as well as stochastic gradients, with either additive or multiplicative noise. Finally, using our continuized framework and expressing the gossip averaging problem as the stochastic minimization of a certain energy function, we provide the first rigorous acceleration of asynchronous gossip algorithms.
Submitted 27 October, 2021; v1 submitted 10 June, 2021;
originally announced June 2021.
-
An optimal gradient method for smooth strongly convex minimization
Authors:
Adrien Taylor,
Yoel Drori
Abstract:
We present an optimal gradient method for smooth strongly convex optimization. The method is optimal in the sense that its worst-case bound on the distance to an optimal point exactly matches the lower bound on the oracle complexity for the class of problems, meaning that no black-box first-order method can have a better worst-case guarantee without further assumptions on the class of problems at hand. In addition, we provide a constructive recipe for obtaining the algorithmic parameters of the method and illustrate that it can be used for deriving methods for other optimality criteria as well.
Submitted 14 June, 2022; v1 submitted 24 January, 2021;
originally announced January 2021.
-
On the oracle complexity of smooth strongly convex minimization
Authors:
Yoel Drori,
Adrien Taylor
Abstract:
We construct a family of functions suitable for establishing lower bounds on the oracle complexity of first-order minimization of smooth strongly-convex functions. Based on this construction, we derive new lower bounds on the complexity of strongly-convex minimization under various inaccuracy criteria. The new bounds match the known upper bounds up to a constant factor, and when the inaccuracy of a solution is measured by its distance to the solution set, the new lower bound exactly matches the upper bound obtained by the recent Information-Theoretic Exact Method by the same authors, thereby establishing the exact oracle complexity for this class of problems.
Submitted 14 June, 2021; v1 submitted 24 January, 2021;
originally announced January 2021.
-
Acceleration Methods
Authors:
Alexandre d'Aspremont,
Damien Scieur,
Adrien Taylor
Abstract:
This monograph covers some recent advances in a range of acceleration techniques frequently used in convex optimization. We first use quadratic optimization problems to introduce two key families of methods, namely momentum and nested optimization schemes. They coincide in the quadratic case to form the Chebyshev method. We discuss momentum methods in detail, starting with the seminal work of Nesterov and structure convergence proofs using a few master templates, such as that for optimized gradient methods, which provide the key benefit of showing how momentum methods optimize convergence guarantees. We further cover proximal acceleration, at the heart of the Catalyst and Accelerated Hybrid Proximal Extragradient frameworks, using similar algorithmic patterns. Common acceleration techniques rely directly on the knowledge of some of the regularity parameters in the problem at hand. We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates while adapting to unobserved regularity parameters.
Submitted 24 September, 2024; v1 submitted 23 January, 2021;
originally announced January 2021.
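The effect of momentum on quadratic problems, which the abstract above uses as its entry point, can be sketched with a small experiment. The code compares plain gradient descent with Polyak's heavy-ball method, a constant-coefficient relative of the Chebyshev method, on a diagonal quadratic. The problem data (`mu`, `L`, `diag`, `x0`), the iteration count, and the helper `run` are hypothetical choices for illustration and are not taken from the monograph.

```python
# Illustrative sketch only: gradient descent vs. heavy-ball momentum on a
# hypothetical diagonal quadratic f(x) = 0.5 * sum_i diag[i] * x[i]**2.
import math

def run(diag, x0, step, momentum, iters):
    """Iterate x_{k+1} = x_k - step * grad f(x_k) + momentum * (x_k - x_{k-1}).

    Returns the final Euclidean distance to the minimizer x* = 0.
    """
    x_prev, x = list(x0), list(x0)
    for _ in range(iters):
        x_next = [xi - step * d * xi + momentum * (xi - xp)
                  for d, xi, xp in zip(diag, x, x_prev)]
        x_prev, x = x, x_next
    return math.sqrt(sum(xi * xi for xi in x))

mu, L = 1.0, 100.0       # smallest and largest Hessian eigenvalues (assumed known)
diag = [mu, 25.0, L]
x0 = [1.0, 1.0, 1.0]

# Gradient descent with the classical step 2 / (L + mu) and no momentum.
gd_err = run(diag, x0, step=2.0 / (L + mu), momentum=0.0, iters=100)

# Heavy-ball with Polyak's tuning for quadratics.
sq_mu, sq_L = math.sqrt(mu), math.sqrt(L)
hb_step = 4.0 / (sq_L + sq_mu) ** 2
hb_momentum = ((sq_L - sq_mu) / (sq_L + sq_mu)) ** 2
hb_err = run(diag, x0, step=hb_step, momentum=hb_momentum, iters=100)
```

Both tunings require `mu` and `L`, which matches the abstract's remark that common acceleration techniques rely on knowledge of regularity parameters.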
-
Convergence of Constrained Anderson Acceleration
Authors:
Mathieu Barré,
Adrien Taylor,
Alexandre d'Aspremont
Abstract:
We prove non-asymptotic linear convergence rates for the constrained Anderson acceleration extrapolation scheme. These guarantees come from new upper bounds on the constrained Chebyshev problem, which consists in minimizing the maximum absolute value of a polynomial on a bounded real interval with $l_1$ constraints on its coefficient vector. Constrained Anderson Acceleration has a numerical cost comparable to that of the original scheme.
Submitted 29 October, 2020;
originally announced October 2020.
-
Principled Analyses and Design of First-Order Methods with Inexact Proximal Operators
Authors:
Mathieu Barré,
Adrien Taylor,
Francis Bach
Abstract:
Proximal operations are among the most common primitives appearing in both practical and theoretical (or high-level) optimization methods. This basic operation typically consists in solving an intermediary (hopefully simpler) optimization problem. In this work, we survey notions of inaccuracies that can be used when solving those intermediary optimization problems. Then, we show that worst-case guarantees for algorithms relying on such inexact proximal operations can be systematically obtained through a generic procedure based on semidefinite programming. This methodology is primarily based on the approach introduced by Drori and Teboulle (2014) and on convex interpolation results, and allows producing non-improvable worst-case analyses. In other words, for a given algorithm, the methodology generates both worst-case certificates (i.e., proofs) and problem instances on which those bounds are achieved.
Relying on this methodology, we study numerical worst-case performances of a few basic methods relying on inexact proximal operations including accelerated variants, and design a variant with optimized worst-case behaviour. We further illustrate how to extend the approach to support strongly convex objectives by studying a simple relatively inexact proximal minimization method.
Submitted 29 June, 2021; v1 submitted 10 June, 2020;
originally announced June 2020.
-
Complexity Guarantees for Polyak Steps with Momentum
Authors:
Mathieu Barré,
Adrien Taylor,
Alexandre d'Aspremont
Abstract:
In smooth strongly convex optimization, knowledge of the strong convexity parameter is critical for obtaining simple methods with accelerated rates. In this work, we study a class of methods, based on Polyak steps, where this knowledge is substituted by that of the optimal value, $f_*$. We first show convergence bounds slightly better than those previously known for the classical case of simple gradient descent with Polyak steps; we then derive an accelerated gradient method with Polyak steps and momentum, along with convergence guarantees.
Submitted 3 July, 2020; v1 submitted 3 February, 2020;
originally announced February 2020.
-
Optimal Complexity and Certification of Bregman First-Order Methods
Authors:
Radu-Alexandru Dragomir,
Adrien Taylor,
Alexandre d'Aspremont,
Jérôme Bolte
Abstract:
We provide a lower bound showing that the $O(1/k)$ convergence rate of the NoLips method (a.k.a. Bregman Gradient) is optimal for the class of functions satisfying the $h$-smoothness assumption. This assumption, also known as relative smoothness, appeared in the recent developments around the Bregman Gradient method, where acceleration remained an open issue. On the way, we show how to constructively obtain the corresponding worst-case functions by extending the computer-assisted performance estimation framework of Drori and Teboulle (Mathematical Programming, 2014) to Bregman first-order methods, and to handle the classes of differentiable and strictly convex functions.
Submitted 17 February, 2021; v1 submitted 19 November, 2019;
originally announced November 2019.
-
Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions
Authors:
Adrien Taylor,
Francis Bach
Abstract:
We provide a novel computer-assisted technique for systematically analyzing first-order methods for optimization. In contrast with previous works, the approach is particularly suited for handling sublinear convergence rates and stochastic oracles. The technique relies on semidefinite programming and potential functions. It allows simultaneously obtaining worst-case guarantees on the behavior of those algorithms, and assisting in choosing appropriate parameters for tuning their worst-case performances. The technique also benefits from comfortable tightness guarantees, meaning that unsatisfactory results can be improved only by changing the setting. We use the approach for analyzing deterministic and stochastic first-order methods under different assumptions on the nature of the stochastic noise. Among others, we treat unstructured noise with bounded variance, different noise models arising in over-parametrized expectation minimization problems, and randomized block-coordinate descent schemes.
Submitted 21 December, 2021; v1 submitted 3 February, 2019;
originally announced February 2019.
-
Operator Splitting Performance Estimation: Tight contraction factors and optimal parameter selection
Authors:
Ernest K. Ryu,
Adrien B. Taylor,
Carolina Bergeling,
Pontus Giselsson
Abstract:
We propose a methodology for studying the performance of common splitting methods through semidefinite programming. We prove tightness of the methodology and demonstrate its value by presenting two applications of it. First, we use the methodology as a tool for computer-assisted proofs to prove tight analytical contraction factors for Douglas-Rachford splitting that are likely too complicated for a human to find bare-handed. Second, we use the methodology as an algorithmic tool to computationally select the optimal splitting method parameters by solving a series of semidefinite programs.
Submitted 30 April, 2020; v1 submitted 1 December, 2018;
originally announced December 2018.
-
Lyapunov Functions for First-Order Methods: Tight Automated Convergence Guarantees
Authors:
Adrien Taylor,
Bryan Van Scoy,
Laurent Lessard
Abstract:
We present a novel way of generating Lyapunov functions for proving linear convergence rates of first-order optimization methods. Our approach provably obtains the fastest linear convergence rate that can be verified by a quadratic Lyapunov function (with given states), and only relies on solving a small-sized semidefinite program. Our approach combines the advantages of performance estimation problems (PEP, due to Drori & Teboulle (2014)) and integral quadratic constraints (IQC, due to Lessard et al. (2016)), and relies on convex interpolation (due to Taylor et al. (2017c;b)).
Submitted 11 June, 2018; v1 submitted 16 March, 2018;
originally announced March 2018.
-
Efficient First-order Methods for Convex Minimization: a Constructive Approach
Authors:
Yoel Drori,
Adrien B. Taylor
Abstract:
We describe a novel constructive technique for devising efficient first-order methods for a wide range of large-scale convex minimization settings, including smooth, non-smooth, and strongly convex minimization. The technique builds upon a certain variant of the conjugate gradient method to construct a family of methods such that a) all methods in the family share the same worst-case guarantee as the base conjugate gradient method, and b) the family includes a fixed-step first-order method. We demonstrate the effectiveness of the approach by deriving optimal methods for the smooth and non-smooth cases, including new methods that forego knowledge of the problem parameters at the cost of a one-dimensional line search per iteration, and a universal method for the union of these classes that requires a three-dimensional search per iteration. In the strongly convex case, we show how numerical tools can be used to perform the construction, and show that the resulting method offers an improved worst-case bound compared to Nesterov's celebrated fast gradient method.
Submitted 26 June, 2019; v1 submitted 15 March, 2018;
originally announced March 2018.
-
Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation
Authors:
Etienne de Klerk,
Francois Glineur,
Adrien Taylor
Abstract:
We provide new tools for worst-case performance analysis of the gradient (or steepest descent) method of Cauchy for smooth strongly convex functions, and Newton's method for self-concordant functions, including the case of inexact search directions. The analysis uses semidefinite programming performance estimation, as pioneered by Drori and Teboulle [Mathematical Programming, 145(1-2):451-482, 2014], and extends recent performance estimation results for the method of Cauchy by the authors [Optimization Letters, 11(7), 1185-1199, 2017]. To illustrate the applicability of the tools, we demonstrate a novel complexity analysis of short step interior point methods using inexact search directions. As an example in this framework, we sketch how to give a rigorous worst-case complexity analysis of a recent interior point method by Abernethy and Hazan [PMLR, 48:2520-2528, 2016].
Submitted 22 June, 2020; v1 submitted 15 September, 2017;
originally announced September 2017.
-
Exact worst-case convergence rates of the proximal gradient method for composite convex minimization
Authors:
Adrien B. Taylor,
Julien M. Hendrickx,
François Glineur
Abstract:
We study the worst-case convergence rates of the proximal gradient method for minimizing the sum of a smooth strongly convex function and a non-smooth convex function whose proximal operator is available.
We establish the exact worst-case convergence rates of the proximal gradient method in this setting for any step size and for different standard performance measures: objective function accuracy, distance to optimality and residual gradient norm.
The proof methodology relies on recent developments in performance estimation of first-order methods based on semidefinite programming. In the case of the proximal gradient method, this methodology allows obtaining exact and non-asymptotic worst-case guarantees that are conceptually very simple, although apparently new.
On the way, we discuss how strong convexity can be replaced by weaker assumptions, while preserving the corresponding convergence rates. We also establish that the same fixed step size policy is optimal for all three performance measures. Finally, we extend recent results on the worst-case behavior of gradient descent with exact line search to the proximal case.
Submitted 29 February, 2020; v1 submitted 11 May, 2017;
originally announced May 2017.
-
Approximate Likelihood Construction for Rough Differential Equations
Authors:
Anastasia Papavasiliou,
Kasia B. Taylor
Abstract:
The paper is split into two parts: in the first part, we construct the exact likelihood for a discretely observed rough differential equation, driven by a piecewise linear path. In the second part, we use this likelihood in order to construct an approximation of the likelihood for a discretely observed differential equation driven by a general class of rough paths. Finally, we study the behaviour of the approximate likelihood when the sampling frequency tends to infinity.
Submitted 9 July, 2018; v1 submitted 8 December, 2016;
originally announced December 2016.
-
On the worst-case complexity of the gradient method with exact line search for smooth strongly convex functions
Authors:
Etienne de Klerk,
François Glineur,
Adrien B. Taylor
Abstract:
We consider the gradient (or steepest) descent method with exact line search applied to a strongly convex function with Lipschitz continuous gradient. We establish the exact worst-case rate of convergence of this scheme, and show that this worst-case behavior is exhibited by a certain convex quadratic function. We also give the tight worst-case complexity bound for a noisy variant of the gradient descent method, where exact line search is performed in a search direction that differs from the negative gradient by at most a prescribed relative tolerance.
The proofs are computer-assisted, and rely on solving semidefinite programming performance estimation problems as introduced in the paper [Y. Drori and M. Teboulle. Performance of first-order methods for smooth convex minimization: a novel approach. Mathematical Programming, 145(1-2):451-482, 2014].
Submitted 15 September, 2016; v1 submitted 30 June, 2016;
originally announced June 2016.
-
Exact Worst-case Performance of First-order Methods for Composite Convex Optimization
Authors:
Adrien B. Taylor,
Julien M. Hendrickx,
François Glineur
Abstract:
We provide a framework for computing the exact worst-case performance of any algorithm belonging to a broad class of oracle-based first-order methods for composite convex optimization, including those performing explicit, projected, proximal, conditional and inexact (sub)gradient steps. We simultaneously obtain tight worst-case guarantees and explicit instances of optimization problems on which the algorithm attains this worst case. We achieve this by reducing the computation of the worst case to solving a convex semidefinite program, generalizing previous works on performance estimation by Drori and Teboulle [13] and the authors [43]. We use these developments to obtain a tighter analysis of the proximal point algorithm and of several variants of fast proximal gradient, conditional gradient, subgradient and alternating projection methods. In particular, we present a new analytical worst-case guarantee for the proximal point algorithm that is twice as good as the previously known one, and improve the standard worst-case guarantee for the conditional gradient method by more than a factor of two. We also show how the optimized gradient method proposed by Kim and Fessler in [22] can be extended by incorporating a projection or a proximal operator, which leads to an algorithm that converges in the worst case twice as fast as the standard accelerated proximal gradient method [2].
Submitted 21 November, 2019; v1 submitted 23 December, 2015;
originally announced December 2015.
-
Exact sampling of diffusions with a discontinuity in the drift
Authors:
Omiros Papaspiliopoulos,
Gareth O. Roberts,
Kasia B. Taylor
Abstract:
We introduce exact methods for the simulation of sample paths of one-dimensional diffusions with a discontinuity in the drift function. Our procedures require the simulation of finite-dimensional candidate draws from probability laws related to those of Brownian motion and its local time and are based on the principle of retrospective rejection sampling. A simple illustration is provided.
Submitted 12 November, 2015;
originally announced November 2015.
-
Smooth Strongly Convex Interpolation and Exact Worst-case Performance of First-order Methods
Authors:
Adrien B. Taylor,
Julien M. Hendrickx,
François Glineur
Abstract:
We show that the exact worst-case performance of fixed-step first-order methods for unconstrained optimization of smooth (possibly strongly) convex functions can be obtained by solving convex programs.
Finding the worst-case performance of a black-box first-order method is formulated as an optimization problem over a set of smooth (strongly) convex functions and initial conditions. We develop closed-form necessary and sufficient conditions for smooth (strongly) convex interpolation, which provide a finite representation for those functions. This allows us to reformulate the worst-case performance estimation problem as an equivalent finite dimension-independent semidefinite optimization problem, whose exact solution can be recovered up to numerical precision. Optimal solutions to this performance estimation problem provide both worst-case performance bounds and explicit functions matching them, as our smooth (strongly) convex interpolation procedure is constructive.
Our work builds on that of Drori and Teboulle [Math. Prog. 145 (1-2), 2014], who introduced and solved relaxations of the performance estimation problem for smooth convex functions.
We apply our approach to different fixed-step first-order methods with several performance criteria, including objective function accuracy and gradient norm. We conjecture several numerically supported worst-case bounds on the performance of the fixed-step gradient, fast gradient and optimized gradient methods, both in the smooth convex and the smooth strongly convex cases, and deduce tight estimates of the optimal step size for the gradient method.
Submitted 31 October, 2016; v1 submitted 19 February, 2015;
originally announced February 2015.
-
Classifying closed 2-orbifolds with Euler characteristics
Authors:
Whitney DuVal,
John Schulte,
Christopher Seaton,
Bradford Taylor
Abstract:
We determine the extent to which the collection of $\Gamma$-Euler-Satake characteristics classifies closed 2-orbifolds. In particular, we show that the closed, connected, effective, orientable 2-orbifolds are classified by the collection of $\Gamma$-Euler-Satake characteristics corresponding to free or free abelian $\Gamma$ and are not classified by those corresponding to any finite collection of finitely generated discrete groups. Similarly, we show that such a classification is not possible for non-orientable 2-orbifolds and any collection of $\Gamma$, nor for noneffective 2-orbifolds. As a corollary, we generate families of orbifolds with the same $\Gamma$-Euler-Satake characteristics in arbitrary dimensions for any finite collection of $\Gamma$; this is used to demonstrate that the $\Gamma$-Euler-Satake characteristics each constitute new invariants of orbifolds.
Submitted 12 February, 2009;
originally announced February 2009.
-
Umbral presentations for polynomial sequences
Authors:
Brian D. Taylor
Abstract:
Using random variables as motivation, this paper presents an exposition of the formalisms developed by Rota and Taylor for the classical umbral calculus. A variety of examples are presented, culminating in several descriptions of sequences of binomial type in terms of umbral polynomials.
Submitted 24 August, 1999;
originally announced August 1999.
-
A straightening algorithm for row-convex tableaux
Authors:
Brian D. Taylor
Abstract:
We produce a new basis for the Schur and Weyl modules associated to a row-convex shape, D. The basis is indexed by a new class of "straight" tableaux which we introduce by weakening the usual requirements for standard tableaux. Spanning is proved via a new straightening algorithm for expanding elements of the representation into this basis. For skew shapes, this algorithm specializes to the classical straightening law. The new straight basis is used to produce bases for flagged Schur and Weyl modules, to provide Groebner and sagbi bases for the homogeneous coordinate rings of some configuration varieties and to produce a flagged branching rule for row-convex representations. Systematic use of supersymmetric letterplace techniques enables the representation theoretic results to be applied to representations of the general linear Lie superalgebra as well as to the general linear group.
Submitted 24 August, 1999;
originally announced August 1999.