arXiv:2403.13320

Direct search for stochastic optimization in random subspaces with zeroth-, first-, and second-order convergence and expected complexity

Authors: K. J. Dzahini, S. M. Wild

Abstract: The work presented here is motivated by the development of StoDARS, a framework for large-scale stochastic blackbox optimization that not only is both an algorithmic and theoretical extension of the stochastic directional direct-search (SDDS) framework but also extends to noisy objectives a recent framework of direct-search algorithms in reduced spaces (DARS). Unlike SDDS, StoDARS achieves scalability by using $m$ search directions generated in random subspaces defined through the columns of Johnson--Lindenstrauss transforms (JLTs) obtained from Haar-distributed orthogonal matrices. For theoretical needs, the quality of these subspaces and the accuracy of random estimates used by the algorithm are required to hold with sufficiently large, but fixed, probabilities. By leveraging an existing supermartingale-based framework, the expected complexity of StoDARS is proved to be similar to that of SDDS and other stochastic full-space methods up to constants, when the objective function is continuously differentiable. By dropping the latter assumption, the convergence of StoDARS to Clarke stationary points with probability one is established. Moreover, the analysis of the second-order behavior of the mesh adaptive direct-search (MADS) algorithm using a second-order-like extension of the Rademacher's theorem-based definition of the Clarke subdifferential (so-called generalized Hessian) is extended to the StoDARS framework, making it the first in a stochastic direct-search setting, to the best of our knowledge.

Submitted 20 March, 2024; originally announced March 2024.

Comments: 30 pages
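
The abstract above describes search directions generated in random subspaces built from Haar-distributed orthogonal matrices. The following is only a minimal sketch of that idea, not the StoDARS construction itself: it draws orthonormal vectors by Gram-Schmidt on Gaussian vectors (a standard way to sample a Haar-distributed orthonormal frame) and polls along plus and minus each of them. The function names, the choice of two directions per basis vector, and the absence of any JLT scaling factor are assumptions made for illustration.

```python
import math
import random


def haar_orthonormal_columns(n, r, rng):
    """Return r orthonormal vectors in R^n via Gram-Schmidt on Gaussian draws."""
    basis = []
    while len(basis) < r:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along the vectors already accepted.
        for q in basis:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        length = math.sqrt(sum(a * a for a in v))
        if length > 1e-12:  # skip a numerically degenerate draw
            basis.append([a / length for a in v])
    return basis


def subspace_poll_directions(n, r, rng):
    """Hypothetical poll set: plus and minus each basis vector of the subspace."""
    basis = haar_orthonormal_columns(n, r, rng)
    return basis + [[-a for a in q] for q in basis]


rng = random.Random(1)
directions = subspace_poll_directions(n=50, r=3, rng=rng)
print(len(directions))  # 2 * r = 6 directions in R^50
```

The point of the sketch is that the number of directions depends on the subspace dimension `r`, not on the ambient dimension `n`.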

arXiv:2403.05322

Direct-search methods in the year 2025: Theoretical guarantees and algorithmic paradigms

Authors: K. J. Dzahini, F. Rinaldi, C. W. Royer, D. Zeffiro

Abstract: Optimizing a function without using derivatives is a challenging paradigm that precludes the use of classical algorithms from nonlinear optimization and may thus seem intractable other than by using heuristics. Nevertheless, the field of derivative-free optimization has succeeded in producing algorithms that do not rely on derivatives and yet are endowed with convergence guarantees. One class of such methods, called direct-search methods, is particularly popular thanks to its simplicity of implementation, even though its theoretical underpinnings are not always easy to grasp. In this work, we survey contemporary direct-search algorithms from a theoretical viewpoint, with the aim of highlighting the key theoretical features of these methods. We provide a basic introduction to the main classes of direct-search methods, including line-search techniques that have received little attention in earlier surveys. We also put a particular emphasis on probabilistic direct-search techniques and their application to noisy problems, a topic that has undergone significant algorithmic development in recent years. Finally, we complement existing surveys by reviewing the main theoretical advances for solving constrained and multiobjective optimization using direct-search algorithms.

Submitted 5 June, 2025; v1 submitted 8 March, 2024; originally announced March 2024.

Comments: Version 2 significantly revised with new material and title change

arXiv:2212.14858

A class of sparse Johnson--Lindenstrauss transforms and analysis of their extreme singular values

Authors: Kwassi Joseph Dzahini, Stefan M. Wild

Abstract: The Johnson--Lindenstrauss (JL) lemma is a powerful tool for dimensionality reduction in modern algorithm design. The lemma states that any set of high-dimensional points in a Euclidean space can be flattened to lower dimensions while approximately preserving pairwise Euclidean distances. Random matrices satisfying this lemma are called JL transforms (JLTs). Inspired by existing $s$-hashing JLTs with exactly $s$ nonzero elements on each column, the present work introduces an ensemble of sparse matrices encompassing so-called $s$-hashing-like matrices whose expected number of nonzero elements on each column is $s$. The independence of the sub-Gaussian entries of these matrices and the knowledge of their exact distribution play an important role in their analyses. Using properties of independent sub-Gaussian random variables, these matrices are demonstrated to be JLTs, and their smallest and largest singular values are estimated non-asymptotically using a technique from geometric functional analysis. As the dimensions of the matrix grow to infinity, these singular values are proved to converge almost surely to fixed quantities (by using the universal Bai--Yin law), and in distribution to the Gaussian orthogonal ensemble (GOE) Tracy--Widom law after proper rescalings. Understanding the behaviors of extreme singular values is important in general because they are often used to define a measure of stability of matrix algorithms. For example, JLTs were recently used in derivative-free optimization algorithmic frameworks to select random subspaces in which are constructed random models or poll directions to achieve scalability, whence estimating their smallest singular value in particular helps determine the dimension of these subspaces.

Submitted 7 November, 2024; v1 submitted 30 December, 2022; originally announced December 2022.

Comments: 21 pages
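
To make the distance-preservation statement concrete, here is a small sketch of the existing $s$-hashing construction that the abstract mentions as its starting point (exactly $s$ nonzero entries per column), not of the new $s$-hashing-like ensemble the paper introduces. The entry values of plus or minus $1/\sqrt{s}$, the dimensions, and all function names are assumptions chosen for illustration.

```python
import math
import random


def s_hashing_matrix(k, d, s, rng):
    """Return a sparse k-by-d s-hashing matrix stored as a list of columns.

    Each column is a list of (row, value) pairs with exactly s nonzero
    entries equal to +/- 1/sqrt(s), placed in s distinct random rows.
    """
    scale = 1.0 / math.sqrt(s)
    columns = []
    for _ in range(d):
        rows = rng.sample(range(k), s)
        columns.append([(r, scale if rng.random() < 0.5 else -scale) for r in rows])
    return columns


def apply_sketch(columns, k, x):
    """Compute S @ x for the column-sparse matrix S."""
    y = [0.0] * k
    for xj, col in zip(x, columns):
        for r, v in col:
            y[r] += v * xj
    return y


def norm(v):
    return math.sqrt(sum(t * t for t in v))


rng = random.Random(0)
d, k, s = 2000, 256, 8
S = s_hashing_matrix(k, d, s, rng)
x = [rng.gauss(0.0, 1.0) for _ in range(d)]
z = [rng.gauss(0.0, 1.0) for _ in range(d)]
diff = [a - b for a, b in zip(x, z)]
# Ratio of the sketched distance to the original distance; near 1 for a JLT.
ratio = norm(apply_sketch(S, k, diff)) / norm(diff)
print(f"distance ratio after sketching: {ratio:.3f}")
```

Every column of this matrix has unit Euclidean norm by construction, so the sketch preserves squared norms in expectation; how tightly the ratio concentrates around 1 depends on `k` and `s`.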

arXiv:2207.06452

doi 10.1137/22M1524072

Stochastic trust-region algorithm in random subspaces with convergence and expected complexity analyses

Authors: Kwassi Joseph Dzahini, Stefan M. Wild

Abstract: This work proposes a framework for large-scale stochastic derivative-free optimization (DFO) by introducing STARS, a trust-region method based on iterative minimization in random subspaces. This framework is both an algorithmic and theoretical extension of an algorithm for stochastic optimization with random models (STORM). Moreover, STARS achieves scalability by minimizing interpolation models that approximate the objective in low-dimensional affine subspaces, thus significantly reducing per-iteration costs in terms of function evaluations and yielding strong performance on large-scale stochastic DFO problems. The user-determined dimension of these subspaces, when the latter are defined, for example, by the columns of so-called Johnson--Lindenstrauss transforms, turns out to be independent of the dimension of the problem. For convergence purposes, both a particular quality of the subspace and the accuracies of random function estimates and models are required to hold with sufficiently high, but fixed, probabilities. Using martingale theory under the latter assumptions, an almost sure global convergence of STARS to a first-order stationary point is shown, and the expected number of iterations required to reach a desired first-order accuracy is proved to be similar to that of STORM and other stochastic DFO algorithms, up to constants.

Submitted 13 July, 2022; originally announced July 2022.

Comments: 26 pages

Journal ref: SIAM Journal on Optimization Vol. 34(3): 2671-2699, 2024

arXiv:2011.04225

doi 10.1007/s10107-022-01787-7

Constrained stochastic blackbox optimization using a progressive barrier and probabilistic estimates

Authors: Kwassi Joseph Dzahini, Michael Kokkolaras, Sébastien Le Digabel

Abstract: This work introduces the StoMADS-PB algorithm for constrained stochastic blackbox optimization, which is an extension of the mesh adaptive direct-search (MADS) method originally developed for deterministic blackbox optimization under general constraints. The values of the objective and constraint functions are provided by a noisy blackbox, i.e., they can only be computed with random noise whose distribution is unknown. As in MADS, constraint violations are aggregated into a single constraint violation function. Since all function values are numerically unavailable, StoMADS-PB uses estimates and introduces so-called probabilistic bounds for the violation. Such estimates and bounds obtained from stochastic observations are required to be accurate and reliable with high but fixed probabilities. The proposed method, which allows intermediate infeasible iterates, accepts new points using sufficient decrease conditions and imposing a threshold on the probabilistic bounds. Using Clarke nonsmooth calculus and martingale theory, Clarke stationarity convergence results for the objective and the violation function are derived with probability one.

Submitted 9 November, 2020; originally announced November 2020.
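
The aggregation of constraint violations mentioned in the abstract can be illustrated with a short sketch. It assumes constraints of the form $c_j(x) \le 0$ and the sum of squared positive parts, a common choice in the progressive-barrier literature; the abstract does not state the exact formula, and the sketch ignores the noise, estimates, and probabilistic bounds that StoMADS-PB adds on top. The function name is hypothetical.

```python
def constraint_violation(constraint_values):
    """Aggregate values c_j(x) of constraints c_j(x) <= 0 into one number.

    Returns sum_j max(c_j(x), 0)^2, which is zero exactly when every
    constraint is satisfied and grows with the amount of violation.
    """
    return sum(max(c, 0.0) ** 2 for c in constraint_values)


print(constraint_violation([-1.0, 0.5, 2.0]))  # 0.25 + 4.0 = 4.25
print(constraint_violation([-1.0, -0.2]))      # feasible point: 0
```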

arXiv:2003.03066

Expected complexity analysis of stochastic direct-search

Authors: Kwassi Joseph Dzahini

Abstract: This work presents the convergence rate analysis of stochastic variants of the broad class of direct-search methods of directional type. It introduces an algorithm designed to optimize differentiable objective functions $f$ whose values can only be computed through a stochastically noisy blackbox. The proposed stochastic directional direct-search (SDDS) algorithm accepts new iterates by imposing a sufficient decrease condition on so-called probabilistic estimates of the corresponding unavailable objective function values. The accuracy of such estimates is required to hold with a sufficiently large but fixed probability $\beta$. The analysis of this method utilizes an existing supermartingale-based framework proposed for the convergence rate analysis of stochastic optimization methods that use adaptive step sizes. It aims to show that the expected number of iterations required to drive the norm of the gradient of $f$ below a given threshold $\epsilon$ is bounded in $\mathcal{O}\left(\epsilon^{\frac{-p}{\min(p-1,1)}}/(2\beta-1)\right)$ with $p>1$. Unlike prior analyses using the same aforementioned framework, such as those of stochastic trust-region methods and stochastic line search methods, SDDS does not use any gradient information to find descent directions. However, its convergence rate is similar to those of both latter methods with a dependence on $\epsilon$ that also matches that of the broad class of deterministic directional direct-search methods which accept new iterates by imposing a sufficient decrease condition.

Submitted 6 March, 2020; originally announced March 2020.
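
The acceptance rule described in this abstract, a sufficient decrease condition imposed on estimates rather than on true function values, can be sketched as follows. This is a simplified illustration and not the SDDS algorithm as published: the coordinate poll directions, the sample-average estimator, the forcing term `c * delta ** p` with `p = 2`, the step-size doubling and halving, and all names and default values are assumptions.

```python
import random


def noisy_estimate(f, x, rng, sigma, samples):
    """Average `samples` evaluations of f at x, each corrupted by Gaussian noise."""
    return sum(f(x) + rng.gauss(0.0, sigma) for _ in range(samples)) / samples


def stochastic_direct_search(f, x0, rng, iters=60, delta=1.0, c=0.1, p=2.0,
                             sigma=0.01, samples=20):
    """Poll along +/- coordinate directions using noisy estimates only."""
    x = list(x0)
    n = len(x)
    for _ in range(iters):
        f_x = noisy_estimate(f, x, rng, sigma, samples)
        success = False
        for i in range(n):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * delta
                f_trial = noisy_estimate(f, trial, rng, sigma, samples)
                # Sufficient decrease test on the estimates, not on true values.
                if f_trial - f_x <= -c * delta ** p:
                    x, success = trial, True
                    break
            if success:
                break
        # Expand the step after a success, shrink it after a failed poll.
        delta = 2.0 * delta if success else 0.5 * delta
    return x


def sphere(x):
    return sum(t * t for t in x)


rng = random.Random(0)
x0 = [2.0, -1.5, 1.0]
x_final = stochastic_direct_search(sphere, x0, rng)
print(sphere(x0), sphere(x_final))
```

For reference, with $p = 2$ the exponent in the abstract's bound is $-p/\min(p-1,1) = -2$, so the bound reads $\mathcal{O}\left(\epsilon^{-2}/(2\beta-1)\right)$, while for $p = 1.5$ it becomes $\mathcal{O}\left(\epsilon^{-3}/(2\beta-1)\right)$.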

arXiv:1911.01012

StoMADS: Stochastic blackbox optimization using probabilistic estimates

Authors: Charles Audet, Kwassi Joseph Dzahini, Michael Kokkolaras, Sébastien Le Digabel

Abstract: This work introduces StoMADS, a stochastic variant of the mesh adaptive direct-search (MADS) algorithm originally developed for deterministic blackbox optimization. StoMADS considers the unconstrained optimization of an objective function f whose values can be computed only through a blackbox corrupted by some random noise following an unknown distribution. The proposed method is based on an algorithmic framework similar to that of MADS and uses random estimates of function values obtained from stochastic observations since the exact deterministic computable version of f is not available. Such estimates are required to be accurate with a sufficiently large but fixed probability and satisfy a variance condition. The ability of the proposed algorithm to generate an asymptotically dense set of search directions is then exploited to show convergence to a Clarke stationary point of f with probability one, using martingale theory.

Submitted 3 November, 2019; originally announced November 2019.

Showing 1–7 of 7 results for author: Dzahini, K J