-
A strong second-order sequential optimality condition for nonlinear programming problems
Authors:
Huimin Li,
Yuya Yamakawa,
Ellen H. Fukuda,
Nobuo Yamashita
Abstract:
Most numerical methods developed for solving nonlinear programming problems are designed to find points that satisfy certain optimality conditions. While the Karush-Kuhn-Tucker conditions are well-known, they become invalid when constraint qualifications (CQs) are not met. Recent advances in sequential optimality conditions address this limitation in both first- and second-order cases, providing genuine optimality guarantees at local optima, even when CQs do not hold. However, some second-order sequential optimality conditions in the recent literature still require restrictive conditions on the constraints. In this paper, we propose a new strong second-order sequential optimality condition without CQs. We also show that a penalty-type method and an augmented Lagrangian method generate points satisfying this new optimality condition.
Submitted 3 March, 2025;
originally announced March 2025.
-
Convergence analysis of a regularized Newton method with generalized regularization terms for convex optimization problems
Authors:
Yuya Yamakawa,
Nobuo Yamashita
Abstract:
This paper presents a regularized Newton method (RNM) with generalized regularization terms for unconstrained convex optimization problems. The generalized regularization includes quadratic, cubic, and elastic net regularizations as special cases. Therefore, the proposed method serves as a general framework that includes not only the classical and cubic RNMs but also a novel RNM with elastic net regularization. We show that the proposed RNM has the global $\mathcal{O}(k^{-2})$ and local superlinear convergence, which are the same as those of the cubic RNM.
Submitted 10 July, 2024; v1 submitted 14 June, 2024;
originally announced June 2024.
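The classical quadratic regularization named in the abstract above as a special case can be sketched as follows. This is a minimal illustration, not the paper's generalized method: the function name `regularized_newton`, the constant `c`, and the rule `mu = c * sqrt(||grad||)` are assumptions chosen for the example, and no line search or safeguard is included.

```python
import numpy as np

def regularized_newton(grad, hess, x0, c=1.0, tol=1e-8, max_iter=100):
    """Newton iteration with a quadratic regularization term (illustrative sketch).

    Each step minimizes g^T d + 0.5 d^T H d + 0.5 * mu * ||d||^2 over d,
    where mu = c * sqrt(||g||) is an assumed regularization rule.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        g_norm = np.linalg.norm(g)
        if g_norm <= tol:
            break
        mu = c * np.sqrt(g_norm)  # regularization shrinks as the gradient vanishes
        d = np.linalg.solve(hess(x) + mu * np.eye(x.size), -g)
        x = x + d
    return x
```

For a convex objective the matrix `hess(x) + mu * I` is positive definite whenever the gradient is nonzero, so the linear solve is well defined even where the Hessian alone is singular.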
-
A Stochastic Variance Reduced Gradient using Barzilai-Borwein Techniques as Second Order Information
Authors:
Hardik Tankaria,
Nobuo Yamashita
Abstract:
In this paper, we consider improving the stochastic variance reduced gradient (SVRG) method by incorporating curvature information of the objective function. We propose to reduce the variance of stochastic gradients using the computationally efficient Barzilai-Borwein (BB) method by incorporating it into the SVRG. We also incorporate a BB step size as its variant. We prove a linear convergence theorem that holds not only for the proposed method but also for the other existing variants of SVRG with second-order information. We conduct numerical experiments on benchmark datasets and show that the proposed method with constant step size performs better than the existing variance reduced methods for some test problems.
Submitted 23 August, 2022;
originally announced August 2022.
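For readers unfamiliar with the BB step size mentioned in the abstract above, the classical (deterministic) formula can be sketched as below. How the paper embeds it into SVRG is not shown here; the function name `bb_step_size` and the `fallback` safeguard are assumptions made for this illustration.

```python
import numpy as np

def bb_step_size(x_prev, x_curr, g_prev, g_curr, fallback=1e-3):
    """Classical Barzilai-Borwein step size: (s^T s) / (s^T y).

    s is the change in iterates and y is the change in gradients.
    The fallback value (an assumption of this sketch) is returned when
    the curvature estimate s^T y is not safely positive.
    """
    s = np.asarray(x_curr, dtype=float) - np.asarray(x_prev, dtype=float)
    y = np.asarray(g_curr, dtype=float) - np.asarray(g_prev, dtype=float)
    sy = float(s @ y)
    if sy <= 1e-12:
        return fallback
    return float(s @ s) / sy
```

On a quadratic whose Hessian is a multiple `a` of the identity, this formula returns `1 / a`, which is why the BB step is often described as a cheap source of curvature information.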
-
Monotonicity for Multiobjective Accelerated Proximal Gradient Methods
Authors:
Yuki Nishimura,
Ellen H. Fukuda,
Nobuo Yamashita
Abstract:
Accelerated proximal gradient methods, which are also called fast iterative shrinkage-thresholding algorithms (FISTA), are known to be efficient for many applications. Recently, Tanabe et al. proposed an extension of FISTA for multiobjective optimization problems. However, similarly to the single-objective minimization case, the objective function values may increase in some iterations, and inexact computations of subproblems can also lead to divergence. Motivated by this, here we propose a variant of FISTA for multiobjective optimization that imposes some monotonicity of the objective function values. In the single-objective case, we retrieve the so-called MFISTA, proposed by Beck and Teboulle. We also prove that our method has global convergence with rate $O(1/k^2)$, where $k$ is the number of iterations, and show some numerical advantages in requiring monotonicity.
Submitted 1 June, 2023; v1 submitted 9 June, 2022;
originally announced June 2022.
-
A globally convergent fast iterative shrinkage-thresholding algorithm with a new momentum factor for single and multi-objective convex optimization
Authors:
Hiroki Tanabe,
Ellen H. Fukuda,
Nobuo Yamashita
Abstract:
Convex-composite optimization, which minimizes an objective function represented by the sum of a differentiable function and a convex one, is widely used in machine learning and signal/image processing. Fast Iterative Shrinkage Thresholding Algorithm (FISTA) is a typical method for solving this problem and has a global convergence rate of $O(1 / k^2)$. Recently, this has been extended to multi-objective optimization, together with the proof of the $O(1 / k^2)$ global convergence rate. However, its momentum factor is classical, and the convergence of its iterates has not been proven. In this work, introducing some additional hyperparameters $(a, b)$, we propose another accelerated proximal gradient method with a general momentum factor, which is new even for the single-objective cases. We show that our proposed method also has a global convergence rate of $O(1/k^2)$ for any $(a,b)$, and further that the generated sequence of iterates converges to a weak Pareto solution when $a$ is positive, an essential property for the finite-time manifold identification. Moreover, we report numerical results with various $(a,b)$, showing that some of these choices give better results than the classical momentum factors.
Submitted 11 May, 2022;
originally announced May 2022.
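The single-objective FISTA with the classical momentum factor, which the abstract above takes as its starting point, can be sketched for the lasso problem $\min_x \frac{1}{2}\|Ax-b\|^2 + \lambda\|x\|_1$. This is an illustrative sketch only: the lasso instance, the names `soft_threshold` and `fista_lasso`, and the fixed iteration count are assumptions, and the paper's generalized momentum with hyperparameters $(a, b)$ is not implemented.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_lasso(A, b, lam, num_iters=500):
    """FISTA with the classical momentum factor for the lasso problem."""
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth part's gradient
    x = np.zeros(A.shape[1])
    y = x.copy()
    t = 1.0
    for _ in range(num_iters):
        grad = A.T @ (A @ y - b)                          # gradient at the extrapolated point
        x_next = soft_threshold(y - grad / L, lam / L)    # proximal gradient step
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)  # classical momentum update
        x, t = x_next, t_next
    return x
```

The factor `(t - 1.0) / t_next` is the classical momentum coefficient; replacing it is where a different momentum rule would enter.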
-
An accelerated proximal gradient method for multiobjective optimization
Authors:
Hiroki Tanabe,
Ellen H. Fukuda,
Nobuo Yamashita
Abstract:
This paper presents an accelerated proximal gradient method for multiobjective optimization, in which each objective function is the sum of a continuously differentiable, convex function and a closed, proper, convex function. Extending first-order methods for multiobjective problems without scalarization has been widely studied, but providing accelerated methods with accurate proofs of convergence rates remains an open problem. Our proposed method is a multiobjective generalization of the accelerated proximal gradient method, also known as the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), for scalar optimization. The key to this successful extension is solving a subproblem with terms exclusive to the multiobjective case. This approach allows us to demonstrate the global convergence rate of the proposed method ($O(1 / k^2)$), using a merit function to measure the complexity. Furthermore, we present an efficient way to solve the subproblem via its dual representation, and we confirm the validity of the proposed method through some numerical experiments.
Submitted 26 April, 2023; v1 submitted 22 February, 2022;
originally announced February 2022.
-
Distributionally Robust Expected Residual Minimization for Stochastic Variational Inequality Problems
Authors:
Atsushi Hori,
Yuya Yamakawa,
Nobuo Yamashita
Abstract:
The stochastic variational inequality problem (SVIP) is an equilibrium model that includes random variables and has been widely applied in various fields such as economics and engineering. Expected residual minimization (ERM) is an established model for obtaining a reasonable solution for the SVIP, and its objective function is an expected value of a suitable merit function for the SVIP. However, the ERM is restricted to the case where the distribution is known in advance. We extend the ERM to ensure the attainment of robust solutions for the SVIP under distributional uncertainty. The extended ERM is referred to as distributionally robust expected residual minimization (DRERM), where the worst-case distribution is derived from the set of probability measures whose expected value and variance coincide with the sample mean and variance, respectively. Under suitable assumptions, we demonstrate that the DRERM can be reformulated as a deterministic convex nonlinear semidefinite programming problem to avoid numerical integration.
Submitted 24 January, 2023; v1 submitted 14 November, 2021;
originally announced November 2021.
-
An equivalent nonlinear optimization model with triangular low-rank factorization for semidefinite programs
Authors:
Yuya Yamakawa,
Tetsuya Ikegami,
Ellen H. Fukuda,
Nobuo Yamashita
Abstract:
In this paper, we propose a new nonlinear optimization model to solve semidefinite optimization problems (SDPs), providing some properties related to local optimal solutions. The proposed model is based on another nonlinear optimization model given by Burer and Monteiro (2003), but it has several nice properties not seen in the existing one. Firstly, the decision variable of the proposed model is a triangular low-rank matrix, and hence the dimension of its decision variable space is smaller. Secondly, the existence of a strict local optimum of the proposed model is guaranteed under some conditions, whereas the existing model has no strict local optimum. Consequently, it is difficult to construct solution methods with fast convergence using the existing model. Some numerical results are also presented to examine the efficiency of the proposed model.
Submitted 29 March, 2021;
originally announced March 2021.
-
A Regularized Limited Memory BFGS method for Large-Scale Unconstrained Optimization and its Efficient Implementations
Authors:
Hardik Tankaria,
Shinji Sugimoto,
Nobuo Yamashita
Abstract:
The limited memory BFGS (L-BFGS) method is one of the popular methods for solving large-scale unconstrained optimization problems. Since the standard L-BFGS method uses a line search to guarantee its global convergence, it sometimes requires a large number of function evaluations. To overcome this difficulty, we propose a new L-BFGS method with a certain regularization technique. We show its global convergence under the usual assumptions. In order to make the method more robust and efficient, we also extend it with several techniques such as a nonmonotone technique and simultaneous use of the Wolfe line search. Finally, we present some numerical results for test problems in CUTEst, which show that the proposed method is robust in terms of the number of problems solved.
Submitted 12 January, 2021;
originally announced January 2021.
-
New merit functions for multiobjective optimization and their properties
Authors:
Hiroki Tanabe,
Ellen H. Fukuda,
Nobuo Yamashita
Abstract:
A merit (gap) function is a map that returns zero at the solutions of problems and strictly positive values otherwise. Its minimization is equivalent to the original problem by definition, and it can estimate the distance between a given point and the solution set. Ideally, this function should have some properties, including the ease of computation, continuity, differentiability, boundedness of the level set, and error boundedness. In this work, we propose new merit functions for multiobjective optimization with lower semicontinuous objectives, convex objectives, and composite objectives, and we show that they have such desirable properties under reasonable assumptions.
Submitted 9 April, 2023; v1 submitted 19 October, 2020;
originally announced October 2020.
-
Convergence rates analysis of a multiobjective proximal gradient method
Authors:
Hiroki Tanabe,
Ellen H. Fukuda,
Nobuo Yamashita
Abstract:
Many descent algorithms for multiobjective optimization have been developed in the last two decades. Tanabe et al. (Comput Optim Appl 72(2):339--361, 2019) proposed a proximal gradient method for multiobjective optimization, which can solve multiobjective problems whose objective function is the sum of a continuously differentiable function and a closed, proper, and convex one. Under reasonable assumptions, it is known that the accumulation points of the sequences generated by this method are Pareto stationary. However, the convergence rates were not established in that paper. Here, we show global convergence rates for the multiobjective proximal gradient method, matching what is known in scalar optimization. More specifically, by using merit functions to measure the complexity, we present the convergence rates for non-convex ($O(\sqrt{1 / k})$), convex ($O(1 / k)$), and strongly convex ($O(r^k)$ for some $r \in (0, 1)$) problems. We also extend the so-called Polyak-Łojasiewicz (PL) inequality for multiobjective optimization and establish the linear convergence rate for multiobjective problems that satisfy such inequalities ($O(r^k)$ for some $r \in (0, 1)$).
Submitted 8 April, 2022; v1 submitted 16 October, 2020;
originally announced October 2020.
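For context, the scalar PL inequality that the abstract above extends can be stated as follows. This is the standard single-objective form, not the paper's multiobjective version; the symbols $\mu$, $L$, and $f^*$ are introduced here for illustration. A differentiable function $f$ with minimum value $f^*$ satisfies the PL inequality with constant $\mu > 0$ if

$$\frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\,\bigl(f(x) - f^*\bigr) \quad \text{for all } x.$$

If, in addition, $\nabla f$ is $L$-Lipschitz continuous, gradient descent with step size $1/L$ satisfies

$$f(x_k) - f^* \;\le\; \Bigl(1 - \frac{\mu}{L}\Bigr)^{k}\,\bigl(f(x_0) - f^*\bigr),$$

which is a linear rate of the form $O(r^k)$ with $r = 1 - \mu/L$.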
-
Alternating Direction Method of Multipliers with Variable Metric Indefinite Proximal Terms for Convex Optimization
Authors:
Yan Gu,
Nobuo Yamashita
Abstract:
This paper studies a proximal alternating direction method of multipliers (ADMM) with variable metric indefinite proximal terms for linearly constrained convex optimization problems. The proximal ADMM plays an important role in many application areas, since the subproblems of the method are easy to solve. Recently, it has been reported that the proximal ADMM with a certain fixed indefinite proximal term is faster than that with a positive semidefinite term, and still has the global convergence property. On the other hand, Gu and Yamashita studied a variable metric semidefinite proximal ADMM whose proximal term is generated by the BFGS update. They reported that a slightly indefinite matrix also makes the algorithm work well in their numerical experiments. Motivated by this fact, we consider a variable metric indefinite proximal ADMM, and give sufficient conditions on the proximal terms for the global convergence. Moreover, we propose a new indefinite proximal term based on the BFGS update which can satisfy the conditions for the global convergence.
Submitted 28 June, 2019;
originally announced June 2019.
-
An Alternating Direction Method of Multipliers with the BFGS Update for Structured Convex Quadratic Optimization
Authors:
Yan Gu,
Nobuo Yamashita
Abstract:
The alternating direction method of multipliers (ADMM) is an effective method for solving a wide range of convex problems. At each iteration, the classical ADMM solves two subproblems exactly. However, in many applications, it is expensive or impossible to obtain the exact solutions of the subproblems. To overcome this difficulty, some proximal terms are added to the subproblems. This class of methods normally solves the original subproblem approximately, and thus takes more iterations. This motivates us to seek a special proximal term that can lead to results as good as those of the classical ADMM. In this paper, we propose a proximal ADMM whose regularized matrix in the proximal term is generated by the BFGS update (or limited memory BFGS) at every iteration. These types of matrices use second-order information of the objective function. The convergence of the proposed method is proved under certain assumptions. Numerical results are presented to show the effectiveness of the proposed proximal ADMM.
Submitted 6 March, 2019;
originally announced March 2019.
-
Duality of nonconvex optimization with positively homogeneous functions
Authors:
Shota Yamanaka,
Nobuo Yamashita
Abstract:
We consider an optimization problem with positively homogeneous functions in its objective and constraint functions. Examples of such positively homogeneous functions include the absolute value function and the $p$-norm function, where $p$ is a positive real number. The problem, which is not necessarily convex, extends the absolute value optimization proposed in [O. L. Mangasarian, Absolute value programming, Computational Optimization and Applications 36 (2007) pp. 43-53]. In this work, we propose a dual formulation that, unlike the Lagrangian dual approach, has a closed form and some interesting properties. In particular, we discuss the relation between the Lagrangian duality and the one proposed here, and give some sufficient conditions under which these dual problems coincide. Finally, we show that some well-known problems, e.g., sum of norms optimization and the group Lasso-type optimization problems, can be reformulated as positively homogeneous optimization problems.
Submitted 21 December, 2017;
originally announced December 2017.
-
Duality of optimization problems with gauge functions
Authors:
Shota Yamanaka,
Nobuo Yamashita
Abstract:
Recently, Yamanaka and Yamashita proposed the so-called positively homogeneous optimization problem, which includes many important problems, such as the absolute-value and the gauge optimizations. They presented a closed form of the dual formulation for the problem, and showed weak duality and the equivalence to the Lagrangian dual under some conditions. In this work, we focus on a special positively homogeneous optimization problem, whose objective function and constraints consist of some gauge and linear functions. We prove not only weak duality but also strong duality. We also study necessary and sufficient optimality conditions associated with the problem. Moreover, we give sufficient conditions under which we can recover a primal solution from a Karush-Kuhn-Tucker point of the dual formulation. Finally, we discuss how to extend the above results to general convex optimization problems by considering the so-called perspective functions.
Submitted 25 December, 2020; v1 submitted 13 December, 2017;
originally announced December 2017.