-
Direct search for stochastic optimization in random subspaces with zeroth-, first-, and second-order convergence and expected complexity
Authors:
K. J. Dzahini,
S. M. Wild
Abstract:
The work presented here is motivated by the development of StoDARS, a framework for large-scale stochastic blackbox optimization that is both an algorithmic and a theoretical extension of the stochastic directional direct-search (SDDS) framework and also extends to noisy objectives a recent framework of direct-search algorithms in reduced spaces (DARS). Unlike SDDS, StoDARS achieves scalability by using $m$ search directions generated in random subspaces defined through the columns of Johnson--Lindenstrauss transforms (JLTs) obtained from Haar-distributed orthogonal matrices. For theoretical purposes, the quality of these subspaces and the accuracy of random estimates used by the algorithm are required to hold with sufficiently large, but fixed, probabilities. By leveraging an existing supermartingale-based framework, the expected complexity of StoDARS is proved to be similar to that of SDDS and other stochastic full-space methods up to constants, when the objective function is continuously differentiable. When the latter assumption is dropped, the convergence of StoDARS to Clarke stationary points with probability one is established. Moreover, the analysis of the second-order behavior of the mesh adaptive direct-search (MADS) algorithm, which uses a second-order-like extension of the Rademacher-theorem-based definition of the Clarke subdifferential (the so-called generalized Hessian), is extended to the StoDARS framework, making it the first such analysis in a stochastic direct-search setting, to the best of our knowledge.
Submitted 20 March, 2024;
originally announced March 2024.
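The random-subspace construction mentioned in the StoDARS abstract above can be illustrated with a minimal sketch. It assumes a Haar-distributed orthogonal matrix drawn through a sign-corrected QR factorization of a Gaussian matrix, and it assumes a poll set made of the positive and negative basis columns of a p-dimensional subspace. The function names, the parameter p, and this particular poll set are hypothetical; the JLT scaling and the paper's actual choice of the $m$ directions are not reproduced here.

```python
import numpy as np


def haar_orthogonal(n, rng):
    """Draw an n x n orthogonal matrix from the Haar distribution."""
    gaussian = rng.standard_normal((n, n))
    q, r = np.linalg.qr(gaussian)
    # QR is unique only up to column signs; fixing them so that diag(r) > 0
    # makes the distribution of q Haar.
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs  # broadcasting scales column j by signs[j]


def subspace_poll_directions(n, p, rng):
    """Return 2p unit directions in R^n lying in a random p-dimensional subspace.

    Assumption (not from the paper): the poll set is {+b_j, -b_j} for the
    p orthonormal basis columns b_j of the subspace.
    """
    basis = haar_orthogonal(n, rng)[:, :p]  # n x p, orthonormal columns
    return np.hstack([basis, -basis])       # n x 2p


rng = np.random.default_rng(0)
directions = subspace_poll_directions(n=50, p=3, rng=rng)
```

Each column of `directions` is one candidate search direction in the full space, so only 2p function evaluations per poll are needed regardless of n in this sketch.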
-
Direct-search methods in the year 2025: Theoretical guarantees and algorithmic paradigms
Authors:
K. J. Dzahini,
F. Rinaldi,
C. W. Royer,
D. Zeffiro
Abstract:
Optimizing a function without using derivatives is a challenging paradigm that precludes the use of classical algorithms from nonlinear optimization, and may thus seem intractable other than by using heuristics. Nevertheless, the field of derivative-free optimization has succeeded in producing algorithms that do not rely on derivatives and yet are endowed with convergence guarantees. One class of such methods, called direct-search methods, is particularly popular thanks to its simplicity of implementation, even though its theoretical underpinnings are not always easy to grasp.
In this work, we survey contemporary direct-search algorithms from a theoretical viewpoint, with the aim of highlighting the key theoretical features of these methods. We provide a basic introduction to the main classes of direct-search methods, including line-search techniques that have received little attention in earlier surveys. We also put a particular emphasis on probabilistic direct-search techniques and their application to noisy problems, a topic that has undergone significant algorithmic development in recent years. Finally, we complement existing surveys by reviewing the main theoretical advances for solving constrained and multiobjective optimization using direct-search algorithms.
Submitted 5 June, 2025; v1 submitted 8 March, 2024;
originally announced March 2024.
-
A class of sparse Johnson--Lindenstrauss transforms and analysis of their extreme singular values
Authors:
Kwassi Joseph Dzahini,
Stefan M. Wild
Abstract:
The Johnson--Lindenstrauss (JL) lemma is a powerful tool for dimensionality reduction in modern algorithm design. The lemma states that any set of high-dimensional points in a Euclidean space can be flattened to lower dimensions while approximately preserving pairwise Euclidean distances. Random matrices satisfying this lemma are called JL transforms (JLTs). Inspired by existing $s$-hashing JLTs with exactly $s$ nonzero elements on each column, the present work introduces an ensemble of sparse matrices encompassing so-called $s$-hashing-like matrices whose expected number of nonzero elements on each column is $s$. The independence of the sub-Gaussian entries of these matrices and the knowledge of their exact distribution play an important role in their analyses. Using properties of independent sub-Gaussian random variables, these matrices are demonstrated to be JLTs, and their smallest and largest singular values are estimated non-asymptotically using a technique from geometric functional analysis. As the dimensions of the matrix grow to infinity, these singular values are proved to converge almost surely to fixed quantities (by using the universal Bai--Yin law), and in distribution to the Gaussian orthogonal ensemble (GOE) Tracy--Widom law after proper rescalings. Understanding the behaviors of extreme singular values is important in general because they are often used to define a measure of stability of matrix algorithms. For example, JLTs were recently used in derivative-free optimization algorithmic frameworks to select random subspaces in which random models or poll directions are constructed to achieve scalability; estimating their smallest singular value in particular therefore helps determine the dimension of these subspaces.
Submitted 7 November, 2024; v1 submitted 30 December, 2022;
originally announced December 2022.
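The contrast drawn in the abstract above, between $s$-hashing matrices with exactly $s$ nonzeros per column and $s$-hashing-like matrices with $s$ nonzeros per column in expectation, can be sketched as follows. The entry law used for the second construction (values $\pm 1/\sqrt{s}$ with probability $s/(2m)$ each, zero otherwise, drawn independently) is an assumption chosen to be consistent with the abstract, not a statement of the paper's definition; the function names are hypothetical.

```python
import numpy as np


def s_hashing(m, n, s, rng):
    """m x n matrix with exactly s nonzeros (+-1/sqrt(s)) in each column."""
    matrix = np.zeros((m, n))
    for j in range(n):
        rows = rng.choice(m, size=s, replace=False)  # s distinct row indices
        matrix[rows, j] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)
    return matrix


def s_hashing_like(m, n, s, rng):
    """m x n matrix with independent entries and s expected nonzeros per column.

    Assumed entry law: +-1/sqrt(s) with probability s/(2m) each, 0 otherwise.
    """
    nonzero = rng.random((m, n)) < s / m          # each entry kept with prob. s/m
    signs = rng.choice([-1.0, 1.0], size=(m, n))  # independent random signs
    return nonzero * signs / np.sqrt(s)


rng = np.random.default_rng(0)
exact = s_hashing(m=20, n=200, s=3, rng=rng)
like = s_hashing_like(m=20, n=200, s=3, rng=rng)
```

In the first construction the entries of a column are dependent, because their supports must add up to exactly $s$; in the second they are independent, which is the property the abstract highlights as useful for the analysis.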
-
Stochastic trust-region algorithm in random subspaces with convergence and expected complexity analyses
Authors:
Kwassi Joseph Dzahini,
Stefan M. Wild
Abstract:
This work proposes a framework for large-scale stochastic derivative-free optimization (DFO) by introducing STARS, a trust-region method based on iterative minimization in random subspaces. This framework is both an algorithmic and theoretical extension of an algorithm for stochastic optimization with random models (STORM). Moreover, STARS achieves scalability by minimizing interpolation models that approximate the objective in low-dimensional affine subspaces, thus significantly reducing per-iteration costs in terms of function evaluations and yielding strong performance on large-scale stochastic DFO problems. The user-determined dimension of these subspaces, when the latter are defined, for example, by the columns of so-called Johnson--Lindenstrauss transforms, turns out to be independent of the dimension of the problem. For convergence purposes, both a particular quality of the subspace and the accuracies of random function estimates and models are required to hold with sufficiently high, but fixed, probabilities. Using martingale theory under the latter assumptions, an almost sure global convergence of STARS to a first-order stationary point is shown, and the expected number of iterations required to reach a desired first-order accuracy is proved to be similar to that of STORM and other stochastic DFO algorithms, up to constants.
Submitted 13 July, 2022;
originally announced July 2022.
-
Constrained stochastic blackbox optimization using a progressive barrier and probabilistic estimates
Authors:
Kwassi Joseph Dzahini,
Michael Kokkolaras,
Sébastien Le Digabel
Abstract:
This work introduces the StoMADS-PB algorithm for constrained stochastic blackbox optimization, which is an extension of the mesh adaptive direct-search (MADS) method originally developed for deterministic blackbox optimization under general constraints. The values of the objective and constraint functions are provided by a noisy blackbox, i.e., they can only be computed with random noise whose distribution is unknown. As in MADS, constraint violations are aggregated into a single constraint violation function. Since all function values are numerically unavailable, StoMADS-PB uses estimates and introduces so-called probabilistic bounds for the violation. Such estimates and bounds obtained from stochastic observations are required to be accurate and reliable with high but fixed probabilities. The proposed method, which allows intermediate infeasible iterates, accepts new points using sufficient decrease conditions and imposing a threshold on the probabilistic bounds. Using Clarke nonsmooth calculus and martingale theory, Clarke stationarity convergence results for the objective and the violation function are derived with probability one.
Submitted 9 November, 2020;
originally announced November 2020.
-
Expected complexity analysis of stochastic direct-search
Authors:
Kwassi Joseph Dzahini
Abstract:
This work presents the convergence rate analysis of stochastic variants of the broad class of direct-search methods of directional type. It introduces an algorithm designed to optimize differentiable objective functions $f$ whose values can only be computed through a stochastically noisy blackbox. The proposed stochastic directional direct-search (SDDS) algorithm accepts new iterates by imposing a sufficient decrease condition on so-called probabilistic estimates of the corresponding unavailable objective function values. The accuracy of such estimates is required to hold with a sufficiently large but fixed probability $β$. The analysis of this method utilizes an existing supermartingale-based framework proposed for the convergence rate analysis of stochastic optimization methods that use adaptive step sizes. It aims to show that the expected number of iterations required to drive the norm of the gradient of $f$ below a given threshold $ε$ is bounded in $\mathcal{O}\left(ε^{\frac{-p}{\min(p-1,1)}}/(2β-1)\right)$ with $p>1$. Unlike the methods in prior analyses using the same framework, such as stochastic trust-region methods and stochastic line-search methods, SDDS does not use any gradient information to find descent directions. However, its convergence rate is similar to those of both latter methods, with a dependence on $ε$ that also matches that of the broad class of deterministic directional direct-search methods which accept new iterates by imposing a sufficient decrease condition.
Submitted 6 March, 2020;
originally announced March 2020.
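To make the complexity bound in the SDDS abstract above concrete, the exponent of the threshold can be evaluated for a few values of $p$. This is only a worked reading of the stated formula; it assumes the `amsmath` package, writes the threshold as \epsilon and the probability as \beta, and the sample values of $p$ and $\beta$ are illustrative choices, not values taken from the paper.

```latex
% Exponent of \epsilon in the SDDS bound, as a function of p > 1.
\[
  \frac{p}{\min(p-1,\,1)} =
  \begin{cases}
    \dfrac{p}{p-1}, & 1 < p \le 2,\\[1ex]
    p, & p \ge 2,
  \end{cases}
  \qquad\text{so}\qquad
  p = \tfrac{3}{2} \mapsto 3,\quad p = 2 \mapsto 2,\quad p = 3 \mapsto 3 .
\]
% The exponent is smallest at p = 2, where the bound reads
\[
  \mathcal{O}\!\left(\frac{\epsilon^{-2}}{2\beta - 1}\right),
  \qquad\text{e.g. } \beta = \tfrac{3}{4} \;\Rightarrow\; \frac{1}{2\beta - 1} = 2 .
\]
```

The factor $1/(2β-1)$ is positive only for $β > 1/2$ and grows without bound as $β$ approaches $1/2$, which is consistent with the requirement that the estimates be accurate with a sufficiently large probability.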
-
StoMADS: Stochastic blackbox optimization using probabilistic estimates
Authors:
Charles Audet,
Kwassi Joseph Dzahini,
Michael Kokkolaras,
Sébastien Le Digabel
Abstract:
This work introduces StoMADS, a stochastic variant of the mesh adaptive direct-search (MADS) algorithm originally developed for deterministic blackbox optimization. StoMADS considers the unconstrained optimization of an objective function $f$ whose values can be computed only through a blackbox corrupted by some random noise following an unknown distribution. The proposed method is based on an algorithmic framework similar to that of MADS and uses random estimates of function values obtained from stochastic observations since the exact deterministic computable version of $f$ is not available. Such estimates are required to be accurate with a sufficiently large but fixed probability and satisfy a variance condition. The ability of the proposed algorithm to generate an asymptotically dense set of search directions is then exploited to show convergence to a Clarke stationary point of $f$ with probability one, using martingale theory.
Submitted 3 November, 2019;
originally announced November 2019.
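The idea of accepting points based on random estimates of function values, as described in the StoMADS abstract above, can be sketched under simple assumptions. Here the estimates are plain sample averages of repeated noisy evaluations, and a trial point is accepted when the estimated decrease exceeds a threshold proportional to the squared step size. The helper names, the threshold form `gamma * eps * step**2`, and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np


def estimate(noisy_f, x, n_samples, rng):
    """Estimate f(x) by averaging n_samples noisy blackbox evaluations."""
    return float(np.mean([noisy_f(x, rng) for _ in range(n_samples)]))


def accept_trial(noisy_f, current, trial, step, rng,
                 n_samples=200, gamma=3.0, eps=0.1):
    """Accept the trial point if the estimated decrease is sufficient.

    Assumed test: f_trial - f_current <= -gamma * eps * step**2.
    """
    f_current = estimate(noisy_f, current, n_samples, rng)
    f_trial = estimate(noisy_f, trial, n_samples, rng)
    return f_trial - f_current <= -gamma * eps * step**2


def noisy_quadratic(x, rng):
    """Toy blackbox: ||x||^2 corrupted by small Gaussian noise."""
    return float(np.sum(np.asarray(x) ** 2) + 0.01 * rng.standard_normal())


rng = np.random.default_rng(0)
improving = accept_trial(noisy_quadratic, [1.0, 0.0], [0.5, 0.0], step=0.5, rng=rng)
worsening = accept_trial(noisy_quadratic, [1.0, 0.0], [1.5, 0.0], step=0.5, rng=rng)
```

Tying the threshold to the step size means that small apparent improvements, which could be artifacts of noise, are rejected when the step is still large relative to the estimate accuracy.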