ReMU: Regional Minimal Updating for Model-Based Derivative-Free Optimization

Abstract: Derivative-free optimization (DFO) problems are optimization problems where derivative information is unavailable or extremely difficult to obtain. Model-based DFO solvers have been applied extensively in scientific computing. Powell's NEWUOA (2004) and Wild's POUNDerS (2014) explore the numerical power of the minimal norm Hessian (MNH) model for DFO and contributed to the open discussion on building better models with fewer data to achieve faster numerical convergence. Another decade later, we propose the regional minimal updating (ReMU) models, and extend the previous models into a broader class. This paper shows motivation behind ReMU models, computational details, theoretical and numerical results on particular extreme points and the barycenter of ReMU's weight coefficient region, and the associated KKT matrix error and distance. Novel metrics, such as the truncated Newton step error, are proposed to numerically understand the new models' properties. A new algorithmic strategy, based on iteratively adjusting the ReMU model type, is also proposed, and shows numerical advantages by combining and switching between the barycentric model and the classic least Frobenius norm model in an online fashion.

Submitted 4 April, 2025; originally announced April 2025.

Comments: 25 pages

arXiv:2404.11893 [pdf, other]

Derivative-Free Optimization via Adaptive Sampling Strategies

Authors: Raghu Bollapragada, Cem Karamanli, Stefan M. Wild

Abstract: In this paper, we present a novel derivative-free optimization framework for solving unconstrained stochastic optimization problems. Many problems in fields ranging from simulation optimization to reinforcement learning involve settings where only stochastic function values are obtained via an oracle with no available gradient information, necessitating the usage of derivative-free optimization methodologies. Our approach includes estimating gradients using stochastic function evaluations and integrating adaptive sampling techniques to control the accuracy in these stochastic approximations. We consider various gradient estimation techniques including standard finite difference, Gaussian smoothing, sphere smoothing, randomized coordinate finite difference, and randomized subspace finite difference methods. We provide theoretical convergence guarantees for our framework and analyze the worst-case iteration and sample complexities associated with each gradient estimation method. Finally, we demonstrate the empirical performance of the methods on logistic regression and nonlinear least squares problems.

Submitted 18 April, 2024; originally announced April 2024.
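
Illustration (not from the paper): the abstract above lists several gradient estimators that use function values only. Two of them, standard forward finite differences and Gaussian smoothing, can be sketched roughly as below. The function names, the step size `h`, the smoothing radius `sigma`, the sample count, and the quadratic test function are all assumptions chosen for this example; the paper's adaptive sampling rules for controlling estimator accuracy are not shown.

```python
import random


def fd_gradient(f, x, h=1e-6):
    """Forward finite-difference gradient estimate (len(x) + 1 evaluations)."""
    fx = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += h  # perturb one coordinate at a time
        grad.append((f(shifted) - fx) / h)
    return grad


def gaussian_smoothing_gradient(f, x, sigma=1e-3, num_samples=2000, seed=0):
    """Monte Carlo average of ((f(x + sigma*u) - f(x)) / sigma) * u, u ~ N(0, I)."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    fx = f(x)
    n = len(x)
    grad = [0.0] * n
    for _ in range(num_samples):
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        shifted = [xi + sigma * ui for xi, ui in zip(x, u)]
        scale = (f(shifted) - fx) / sigma  # directional difference quotient
        for i in range(n):
            grad[i] += scale * u[i]
    return [g / num_samples for g in grad]


def quadratic(x):
    """Hypothetical test objective with known gradient equal to x."""
    return 0.5 * sum(xi * xi for xi in x)
```

The finite-difference estimate is deterministic for a deterministic `f`, while the Gaussian-smoothing estimate is itself random, so its accuracy depends on the number of samples drawn.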

arXiv:2403.13320 [pdf, other]

Direct search for stochastic optimization in random subspaces with zeroth-, first-, and second-order convergence and expected complexity

Authors: K. J. Dzahini, S. M. Wild

Abstract: The work presented here is motivated by the development of StoDARS, a framework for large-scale stochastic blackbox optimization that not only is both an algorithmic and theoretical extension of the stochastic directional direct-search (SDDS) framework but also extends to noisy objectives a recent framework of direct-search algorithms in reduced spaces (DARS). Unlike SDDS, StoDARS achieves scalability by using $m$ search directions generated in random subspaces defined through the columns of Johnson--Lindenstrauss transforms (JLTs) obtained from Haar-distributed orthogonal matrices. For theoretical needs, the quality of these subspaces and the accuracy of random estimates used by the algorithm are required to hold with sufficiently large, but fixed, probabilities. By leveraging an existing supermartingale-based framework, the expected complexity of StoDARS is proved to be similar to that of SDDS and other stochastic full-space methods up to constants, when the objective function is continuously differentiable. By dropping the latter assumption, the convergence of StoDARS to Clarke stationary points with probability one is established. Moreover, the analysis of the second-order behavior of the mesh adaptive direct-search (MADS) algorithm using a second-order-like extension of the Rademacher's theorem-based definition of the Clarke subdifferential (so-called generalized Hessian) is extended to the StoDARS framework, making it the first in a stochastic direct-search setting, to the best of our knowledge.

Submitted 20 March, 2024; originally announced March 2024.

Comments: 30 pages

arXiv:2402.15380 [pdf, other]

Extended Fayans energy density functional: optimization and analysis

Authors: Paul-Gerhard Reinhard, Jared O'Neal, Stefan M. Wild, Witold Nazarewicz

Abstract: The Fayans energy density functional (EDF) has been very successful in describing global nuclear properties (binding energies, charge radii, and especially differences of radii) within nuclear density functional theory. In a recent study, supervised machine learning methods were used to calibrate the Fayans EDF. Building on this experience, in this work we explore the effect of adding isovector pairing terms, which are responsible for different proton and neutron pairing fields, by comparing a 13D model without the isovector pairing term against the extended 14D model. At the heart of the calibration is a carefully selected heterogeneous dataset of experimental observables representing ground-state properties of spherical even-even nuclei. To quantify the impact of the calibration dataset on model parameters and the importance of the new terms, we carry out advanced sensitivity and correlation analysis on both models. The extension to 14D improves the overall quality of the model by about 30%. The enhanced degrees of freedom of the 14D model reduce correlations between model parameters and enhance sensitivity.

Submitted 23 February, 2024; originally announced February 2024.

Comments: 29-page article, 1-page notice

arXiv:2304.06881 [pdf, other]

doi 10.1287/ijoc.2023.0250

Designing a Framework for Solving Multiobjective Simulation Optimization Problems

Authors: Tyler H. Chang, Stefan M. Wild

Abstract: Multiobjective simulation optimization (MOSO) problems are optimization problems with multiple conflicting objectives, where evaluation of at least one of the objectives depends on a black-box numerical code or real-world experiment, which we refer to as a simulation. While an extensive body of research is dedicated to developing new algorithms and methods for solving these and related problems, it is challenging and time consuming to integrate these techniques into real world production-ready solvers. This is partly due to the diversity and complexity of modern state-of-the-art MOSO algorithms and methods and partly due to the complexity and specificity of many real-world problems and their corresponding computing environments. The complexity of this problem is only compounded when introducing potentially complex and/or domain-specific surrogate modeling techniques, problem formulations, design spaces, and data acquisition functions. This paper carefully surveys the current state-of-the-art in MOSO algorithms, techniques, and solvers; as well as problem types and computational environments where MOSO is commonly applied. We then present several key challenges in the design of a Parallel Multiobjective Simulation Optimization framework (ParMOO) and how they have been addressed. Finally, we provide two case studies demonstrating how customized ParMOO solvers can be quickly built and deployed to solve real-world MOSO problems.

Submitted 9 January, 2025; v1 submitted 13 April, 2023; originally announced April 2023.

arXiv:2302.09128 [pdf, other]

A Stochastic Quasi-Newton Method in the Absence of Common Random Numbers

Authors: Matt Menickelly, Stefan M. Wild, Miaolan Xie

Abstract: We present a quasi-Newton method for unconstrained stochastic optimization. Most existing literature on this topic assumes a setting of stochastic optimization in which a finite sum of component functions is a reasonable approximation of an expectation, and hence one can design a quasi-Newton method to exploit common random numbers. In contrast, and motivated by problems in variational quantum algorithms, we assume that function values and gradients are available only through inexact probabilistic zeroth- and first-order oracles and no common random numbers can be exploited. Our algorithmic framework -- based on prior work on the SASS algorithm -- is general and does not assume common random numbers. We derive a high-probability tail bound on the iteration complexity of the algorithm for nonconvex and strongly convex functions. We present numerical results demonstrating the empirical benefits of augmenting SASS with our quasi-Newton updating scheme, both on synthetic problems and on real problems in quantum chemistry.

Submitted 1 September, 2024; v1 submitted 17 February, 2023; originally announced February 2023.

MSC Class: 90C15; 90C53; 90C30; 90C26

arXiv:2212.14858 [pdf, other]

A class of sparse Johnson--Lindenstrauss transforms and analysis of their extreme singular values

Authors: Kwassi Joseph Dzahini, Stefan M. Wild

Abstract: The Johnson--Lindenstrauss (JL) lemma is a powerful tool for dimensionality reduction in modern algorithm design. The lemma states that any set of high-dimensional points in a Euclidean space can be flattened to lower dimensions while approximately preserving pairwise Euclidean distances. Random matrices satisfying this lemma are called JL transforms (JLTs). Inspired by existing $s$-hashing JLTs with exactly $s$ nonzero elements on each column, the present work introduces an ensemble of sparse matrices encompassing so-called $s$-hashing-like matrices whose expected number of nonzero elements on each column is $s$. The independence of the sub-Gaussian entries of these matrices and the knowledge of their exact distribution play an important role in their analyses. Using properties of independent sub-Gaussian random variables, these matrices are demonstrated to be JLTs, and their smallest and largest singular values are estimated non-asymptotically using a technique from geometric functional analysis. As the dimensions of the matrix grow to infinity, these singular values are proved to converge almost surely to fixed quantities (by using the universal Bai--Yin law), and in distribution to the Gaussian orthogonal ensemble (GOE) Tracy--Widom law after proper rescalings. Understanding the behaviors of extreme singular values is important in general because they are often used to define a measure of stability of matrix algorithms. For example, JLTs were recently used in derivative-free optimization algorithmic frameworks to select random subspaces in which are constructed random models or poll directions to achieve scalability, whence estimating their smallest singular value in particular helps determine the dimension of these subspaces.

Submitted 7 November, 2024; v1 submitted 30 December, 2022; originally announced December 2022.

Comments: 21 pages
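
Illustration (not from the paper): the abstract above takes as its starting point $s$-hashing matrices with exactly $s$ nonzero elements on each column. A minimal sketch of that classical construction follows, assuming the common convention that each nonzero entry is $\pm 1/\sqrt{s}$ with equal probability. The function names are hypothetical, and the paper's $s$-hashing-like ensemble, where the number of nonzeros per column is only $s$ in expectation, is not reproduced here.

```python
import math
import random


def s_hashing_matrix(m, n, s, seed=0):
    """Return an m x n matrix (list of rows) with exactly s nonzeros per column."""
    rng = random.Random(seed)
    matrix = [[0.0] * n for _ in range(m)]
    value = 1.0 / math.sqrt(s)  # assumed scaling so every column has unit norm
    for j in range(n):
        # choose s distinct rows for column j, then give each a random sign
        for i in rng.sample(range(m), s):
            matrix[i][j] = value if rng.random() < 0.5 else -value
    return matrix


def apply_matrix(matrix, x):
    """Compute the matrix-vector product as a plain list."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]


def norm(x):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(v * v for v in x))
```

With this scaling every column has unit Euclidean norm, so coordinate vectors are mapped to unit vectors exactly; for general vectors the norm is preserved only approximately and with high probability, which is the content of the JL property.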

arXiv:2207.06452 [pdf, other]

doi 10.1137/22M1524072

Stochastic trust-region algorithm in random subspaces with convergence and expected complexity analyses

Authors: Kwassi Joseph Dzahini, Stefan M. Wild

Abstract: This work proposes a framework for large-scale stochastic derivative-free optimization (DFO) by introducing STARS, a trust-region method based on iterative minimization in random subspaces. This framework is both an algorithmic and theoretical extension of an algorithm for stochastic optimization with random models (STORM). Moreover, STARS achieves scalability by minimizing interpolation models that approximate the objective in low-dimensional affine subspaces, thus significantly reducing per-iteration costs in terms of function evaluations and yielding strong performance on large-scale stochastic DFO problems. The user-determined dimension of these subspaces, when the latter are defined, for example, by the columns of so-called Johnson--Lindenstrauss transforms, turns out to be independent of the dimension of the problem. For convergence purposes, both a particular quality of the subspace and the accuracies of random function estimates and models are required to hold with sufficiently high, but fixed, probabilities. Using martingale theory under the latter assumptions, an almost sure global convergence of STARS to a first-order stationary point is shown, and the expected number of iterations required to reach a desired first-order accuracy is proved to be similar to that of STORM and other stochastic DFO algorithms, up to constants.

Submitted 13 July, 2022; originally announced July 2022.

Comments: 26 pages

Journal ref: SIAM Journal on Optimization Vol. 34(3): 2671-2699, 2024

arXiv:2207.06305 [pdf, ps, other]

Stochastic Average Model Methods

Authors: Matt Menickelly, Stefan M. Wild

Abstract: We consider the solution of finite-sum minimization problems, such as those appearing in nonlinear least-squares or general empirical risk minimization problems. We are motivated by problems in which the summand functions are computationally expensive and evaluating all summands on every iteration of an optimization method may be undesirable. We present the idea of stochastic average model (SAM) methods, inspired by stochastic average gradient methods. SAM methods sample component functions on each iteration of a trust-region method according to a discrete probability distribution on component functions; the distribution is designed to minimize an upper bound on the variance of the resulting stochastic model. We present promising numerical results concerning an implemented variant extending the derivative-free model-based trust-region solver POUNDERS, which we name SAM-POUNDERS.

Submitted 20 March, 2024; v1 submitted 13 July, 2022; originally announced July 2022.

arXiv:2205.09627 [pdf, other]

doi 10.1007/s11590-022-01950-1

Modeling Approaches for Addressing Simple Unrelaxable Constraints with Unconstrained Optimization Methods

Authors: Misha Padidar, Jeffrey Larson, Stefan M. Wild

Abstract: We explore novel approaches for solving nonlinear optimization problems with unrelaxable bound constraints, which must be satisfied before the objective function can be evaluated. Our method reformulates the unrelaxable bound-constrained problem as an unconstrained optimization problem that is amenable to existing unconstrained optimization methods. The reformulation relies on a domain warping to form a merit function; the choice of the warping determines the level of exactness with which the unconstrained problem can be used to find solutions to the bound-constrained problem, as well as key properties of the unconstrained formulation such as smoothness. We develop theory when the domain warping is a multioutput sigmoidal warping, and we explore the practical elements of applying unconstrained optimization methods to the formulation. We develop an algorithm that exploits the structure of the sigmoidal warping to guarantee that unconstrained optimization algorithms applied to the merit function will find a stationary point to the desired tolerance.

Submitted 10 November, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

Comments: 20 pages, 5 figures
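
Illustration (not from the paper): the abstract above describes composing the objective with a domain warping so that an unconstrained solver only ever evaluates points inside the bounds. A minimal sketch follows, assuming a componentwise logistic sigmoid as the warping. The names `warp` and `make_merit` are hypothetical, and the paper's exact warping and its stationarity-guaranteeing algorithm may differ from this.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function with values in [0, 1]."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def warp(y, lower, upper):
    """Map an unconstrained point y into the box [lower, upper], componentwise."""
    return [lo + (hi - lo) * sigmoid(yi) for yi, lo, hi in zip(y, lower, upper)]


def make_merit(f, lower, upper):
    """Return the merit function y -> f(warp(y)), defined for every real y."""
    def merit(y):
        return f(warp(y, lower, upper))
    return merit
```

Any unconstrained method can then be applied to `merit`; since `warp` always returns a point within the bounds, `f` is never evaluated at an infeasible point.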

arXiv:2109.12213 [pdf, other]

Adaptive Sampling Quasi-Newton Methods for Zeroth-Order Stochastic Optimization

Authors: Raghu Bollapragada, Stefan M. Wild

Abstract: We consider unconstrained stochastic optimization problems with no available gradient information. Such problems arise in settings from derivative-free simulation optimization to reinforcement learning. We propose an adaptive sampling quasi-Newton method where we estimate the gradients of a stochastic function using finite differences within a common random number framework. We develop modified versions of a norm test and an inner product quasi-Newton test to control the sample sizes used in the stochastic approximations and provide global convergence results to the neighborhood of the optimal solution. We present numerical experiments on simulation optimization problems to illustrate the performance of the proposed algorithm. When compared with classical zeroth-order stochastic gradient methods, we observe that our strategies of adapting the sample sizes significantly improve performance in terms of the number of stochastic function evaluations required.

Submitted 24 September, 2021; originally announced September 2021.

arXiv:2108.04774 [pdf, other]

doi 10.1007/s11081-022-09733-4

Derivative-Free Optimization of a Rapid-Cycling Synchrotron

Authors: Jeffrey S. Eldred, Jeffrey Larson, Misha Padidar, Eric Stern, Stefan M. Wild

Abstract: We develop and solve a constrained optimization model to identify an integrable optics rapid-cycling synchrotron lattice design that performs well in several capacities. Our model encodes the design criteria into 78 linear and nonlinear constraints, as well as a single nonsmooth objective, where the objective and some constraints are defined from the output of Synergia, an accelerator simulator. We detail the difficulties of the 23-dimensional simulation-constrained decision space and establish that the space is nonempty. We use a derivative-free manifold sampling algorithm to account for structured nondifferentiability in the objective function. Our numerical results quantify the dependence of solutions on constraint parameters and the effect of the form of objective function.

Submitted 10 August, 2021; originally announced August 2021.

Comments: 24 pages, 12 figures

arXiv:2105.09824 [pdf, other]

Lookahead Acquisition Functions for Finite-Horizon Time-Dependent Bayesian Optimization and Application to Quantum Optimal Control

Authors: S. Ashwin Renganathan, Jeffrey Larson, Stefan M. Wild

Abstract: We propose a novel Bayesian method to solve the maximization of a time-dependent expensive-to-evaluate stochastic oracle. We are interested in the decision that maximizes the oracle at a finite time horizon, given a limited budget of noisy evaluations of the oracle that can be performed before the horizon. Our recursive two-step lookahead acquisition function for Bayesian optimization makes nonmyopic decisions at every stage by maximizing the expected utility at the specified time horizon. Specifically, we propose a generalized two-step lookahead framework with a customizable \emph{value} function that allows users to define the utility. We illustrate how lookahead versions of classic acquisition functions such as the expected improvement, probability of improvement, and upper confidence bound can be obtained with this framework. We demonstrate the utility of our proposed approach on several carefully constructed synthetic cases and a real-world quantum optimal control problem.

Submitted 20 May, 2021; originally announced May 2021.

Comments: 22 pages, 11 figures

arXiv:2010.05668 [pdf, other]

doi 10.1088/1361-6471/abd009

Optimization and Supervised Machine Learning Methods for Fitting Numerical Physics Models without Derivatives

Authors: Raghu Bollapragada, Matt Menickelly, Witold Nazarewicz, Jared O'Neal, Paul-Gerhard Reinhard, Stefan M. Wild

Abstract: We address the calibration of a computationally expensive nuclear physics model for which derivative information with respect to the fit parameters is not readily available. Of particular interest is the performance of optimization-based training algorithms when dozens, rather than millions or more, of training data are available and when the expense of the model places limitations on the number of concurrent model evaluations that can be performed. As a case study, we consider the Fayans energy density functional model, which has characteristics similar to many model fitting and calibration problems in nuclear physics. We analyze hyperparameter tuning considerations and variability associated with stochastic optimization algorithms and illustrate considerations for tuning in different computational settings.

Submitted 14 December, 2020; v1 submitted 12 October, 2020; originally announced October 2020.

Comments: 25-page article, 9-page supplement, 1-page notice

arXiv:2001.00887 [pdf, other]

Tuning Multigrid Methods with Robust Optimization

Authors: Jed Brown, Yunhui He, Scott MacLachlan, Matt Menickelly, Stefan M. Wild

Abstract: Local Fourier analysis is a useful tool for predicting and analyzing the performance of many efficient algorithms for the solution of discretized PDEs, such as multigrid and domain decomposition methods. The crucial aspect of local Fourier analysis is that it can be used to minimize an estimate of the spectral radius of a stationary iteration, or the condition number of a preconditioned system, in terms of a symbol representation of the algorithm. In practice, this is a "minimax" problem, minimizing with respect to solver parameters the appropriate measure of work, which involves maximizing over the Fourier frequency. Often, several algorithmic parameters may be determined by local Fourier analysis in order to obtain efficient algorithms. Analytical solutions to minimax problems are rarely possible beyond simple problems; the status quo in local Fourier analysis involves grid sampling, which is prohibitively expensive in high dimensions. In this paper, we propose and explore optimization algorithms to solve these problems efficiently. Several examples, with known and unknown analytical solutions, are presented to show the effectiveness of these approaches.

Submitted 27 July, 2020; v1 submitted 3 January, 2020; originally announced January 2020.

arXiv:1910.13516 [pdf, other]

Adaptive Sampling Quasi-Newton Methods for Derivative-Free Stochastic Optimization

Authors: Raghu Bollapragada, Stefan M. Wild

Abstract: We consider stochastic zero-order optimization problems, which arise in settings from simulation optimization to reinforcement learning. We propose an adaptive sampling quasi-Newton method where we estimate the gradients of a stochastic function using finite differences within a common random number framework. We employ modified versions of a norm test and an inner product quasi-Newton test to control the sample sizes used in the stochastic approximations. We provide preliminary numerical experiments to illustrate potential performance benefits of the proposed method.

Submitted 29 October, 2019; originally announced October 2019.

Comments: 7 pages, NeurIPS workshop

arXiv:1904.11585 [pdf, other]

doi 10.1017/S0962492919000060

Derivative-free optimization methods

Authors: Jeffrey Larson, Matt Menickelly, Stefan M. Wild

Abstract: In many optimization problems arising from scientific, engineering and artificial intelligence applications, objective and constraint functions are available only as the output of a black-box or simulation oracle that does not provide derivative information. Such settings necessitate the use of methods for derivative-free, or zeroth-order, optimization. We provide a review and perspectives on developments in these methods, with an emphasis on highlighting recent developments and on unifying treatment of such problems in the non-linear optimization and machine learning literature. We categorize methods based on assumed properties of the black-box functions, as well as features of the methods. We first overview the primary setting of deterministic methods applied to unconstrained, non-convex optimization problems where the objective function is defined by a deterministic black-box oracle. We then discuss developments in randomized methods, methods that assume some additional structure about the objective (including convexity, separability and general non-smooth compositions), methods for problems where the output of the black-box oracle is stochastic, and methods for handling different types of constraints.

Submitted 25 June, 2019; v1 submitted 25 April, 2019; originally announced April 2019.

Journal ref: Acta Numerica 28 (2019) 287-404

arXiv:1903.11366 [pdf, other]

doi 10.1007/s10898-020-00978-w

A Method for Convex Black-Box Integer Global Optimization

Authors: Jeffrey Larson, Sven Leyffer, Prashant Palkar, Stefan M. Wild

Abstract: We study the problem of minimizing a convex function on a nonempty, finite subset of the integer lattice when the function cannot be evaluated at noninteger points. We propose a new underestimator that does not require access to (sub)gradients of the objective but, rather, uses secant linear functions that interpolate the objective function at previously evaluated points. These linear mappings are shown to underestimate the objective in disconnected portions of the domain. Therefore, the union of these conditional cuts provides a nonconvex underestimator of the objective. We propose an algorithm that alternates between updating the underestimator and evaluating the objective function. We prove that the algorithm converges to a global minimum of the objective function on the feasible set. We present two approaches for representing the underestimator and compare their computational effectiveness. We also compare implementations of our algorithm with existing methods for minimizing functions on a subset of the integer lattice. We discuss the difficulty of this problem class and provide insights into why a computational proof of optimality is challenging even for moderate problem sizes.

Submitted 7 January, 2020; v1 submitted 27 March, 2019; originally announced March 2019.

arXiv:1902.02027 [pdf, ps, other]

Simultaneous Sensing Error Recovery and Tomographic Inversion Using an Optimization-based Approach

Authors: Anthony P. Austin, Zichao Wendy Di, Sven Leyffer, Stefan M. Wild

Abstract: Tomography can be used to reveal internal properties of a 3D object using any penetrating wave. Advanced tomographic imaging techniques, however, are vulnerable to both systematic and random errors associated with the experimental conditions, which are often beyond the capabilities of the state-of-the-art reconstruction techniques such as regularizations. Because they can lead to reduced spatial resolution and even misinterpretation of the underlying sample structures, these errors present a fundamental obstacle to full realization of the capabilities of next-generation physical imaging. In this work, we develop efficient and explicit recovery schemes of the most common experimental error: movement of the center of rotation during the experiment. We formulate new physical models to capture the experimental setup, and we devise new mathematical optimization formulations for reliable inversion of complex samples. We demonstrate and validate the efficacy of our approach on synthetic data under known perturbations of the center of rotation.

Submitted 6 February, 2019; v1 submitted 6 February, 2019; originally announced February 2019.

MSC Class: 68Q25; 68R10; 68U05

arXiv:1807.02736 [pdf, other]

Robust Learning of Trimmed Estimators via Manifold Sampling

Authors: Matt Menickelly, Stefan M. Wild

Abstract: We adapt a manifold sampling algorithm for the nonsmooth, nonconvex formulations of learning that arise when imposing robustness to outliers present in the training data. We demonstrate the approach on objectives based on trimmed loss. Empirical results show that the method has favorable scaling properties. Although savings in time come at the expense of not certifying optimality, the algorithm consistently returns high-quality solutions on the trimmed linear regression and multiclass classification problems tested.

Submitted 7 July, 2018; originally announced July 2018.

Comments: In ICML 2018 Workshop on Modern Trends in Nonconvex Optimization for Machine Learning

arXiv:1505.07881 [pdf, other]

A Taxonomy of Constraints in Simulation-Based Optimization

Authors: Sébastien Le Digabel, Stefan M. Wild

Abstract: The types of constraints encountered in black-box and simulation-based optimization problems differ significantly from those treated in nonlinear programming. We introduce a characterization of constraints to address this situation. We provide formal definitions for several constraint classes and present illustrative examples in the context of the resulting taxonomy. This taxonomy, denoted QRAK, is useful for modeling and problem formulation, as well as optimization software development and deployment. It can also be used as the basis for a dialog with practitioners in moving problems to increasingly solvable branches of optimization.

Submitted 28 May, 2015; originally announced May 2015.

Showing 1–21 of 21 results for author: Wild, S M