Search | arXiv e-print repository

Bayesian Optimization with Preference Exploration by Monotonic Neural Network Ensemble

Authors: Hanyang Wang, Juergen Branke, Matthias Poloczek

Abstract: Many real-world black-box optimization problems have multiple conflicting objectives. Rather than attempting to approximate the entire set of Pareto-optimal solutions, interactive preference learning allows to focus the search on the most relevant subset. However, few previous studies have exploited the fact that utility functions are usually monotonic. In this paper, we address the Bayesian Optim… ▽ More Many real-world black-box optimization problems have multiple conflicting objectives. Rather than attempting to approximate the entire set of Pareto-optimal solutions, interactive preference learning allows to focus the search on the most relevant subset. However, few previous studies have exploited the fact that utility functions are usually monotonic. In this paper, we address the Bayesian Optimization with Preference Exploration (BOPE) problem and propose using a neural network ensemble as a utility surrogate model. This approach naturally integrates monotonicity and supports pairwise comparison data. Our experiments demonstrate that the proposed method outperforms state-of-the-art approaches and exhibits robustness to noise in utility evaluations. An ablation study highlights the critical role of monotonicity in enhancing performance. △ Less

Submitted 30 January, 2025; originally announced January 2025.

arXiv:2304.11468 [pdf, other]

Increasing the Scope as You Learn: Adaptive Bayesian Optimization in Nested Subspaces

Authors: Leonard Papenmeier, Luigi Nardi, Matthias Poloczek

Abstract: Recent advances have extended the scope of Bayesian optimization (BO) to expensive-to-evaluate black-box functions with dozens of dimensions, aspiring to unlock impactful applications, for example, in the life sciences, neural architecture search, and robotics. However, a closer examination reveals that the state-of-the-art methods for high-dimensional Bayesian optimization (HDBO) suffer from degr… ▽ More Recent advances have extended the scope of Bayesian optimization (BO) to expensive-to-evaluate black-box functions with dozens of dimensions, aspiring to unlock impactful applications, for example, in the life sciences, neural architecture search, and robotics. However, a closer examination reveals that the state-of-the-art methods for high-dimensional Bayesian optimization (HDBO) suffer from degrading performance as the number of dimensions increases or even risk failure if certain unverifiable assumptions are not met. This paper proposes BAxUS that leverages a novel family of nested random subspaces to adapt the space it optimizes over to the problem. This ensures high performance while removing the risk of failure, which we assert via theoretical guarantees. A comprehensive evaluation demonstrates that BAxUS achieves better results than the state-of-the-art methods for a broad set of applications. △ Less

Submitted 22 April, 2023; originally announced April 2023.

Comments: 28 pages, 8 figures. Accepted to NeurIPS 2022. This is the revised version and includes the appendix

Journal ref: Advances in Neural Information Processing Systems 35 (NeurIPS 2022)

arXiv:2002.08526 [pdf, other]

Scalable Constrained Bayesian Optimization

Authors: David Eriksson, Matthias Poloczek

Abstract: The global optimization of a high-dimensional black-box function under black-box constraints is a pervasive task in machine learning, control, and engineering. These problems are challenging since the feasible set is typically non-convex and hard to find, in addition to the curses of dimensionality and the heterogeneity of the underlying functions. In particular, these characteristics dramatically… ▽ More The global optimization of a high-dimensional black-box function under black-box constraints is a pervasive task in machine learning, control, and engineering. These problems are challenging since the feasible set is typically non-convex and hard to find, in addition to the curses of dimensionality and the heterogeneity of the underlying functions. In particular, these characteristics dramatically impact the performance of Bayesian optimization methods, that otherwise have become the de facto standard for sample-efficient optimization in unconstrained settings, leaving practitioners with evolutionary strategies or heuristics. We propose the scalable constrained Bayesian optimization (SCBO) algorithm that overcomes the above challenges and pushes the applicability of Bayesian optimization far beyond the state-of-the-art. A comprehensive experimental evaluation demonstrates that SCBO achieves excellent results on a variety of benchmarks. To this end, we propose two new control problems that we expect to be of independent value for the scientific community. △ Less

Submitted 28 February, 2021; v1 submitted 19 February, 2020; originally announced February 2020.

Comments: To appear in Proceedings of AISTATS 2021

arXiv:1910.09259 [pdf, other]

Bayesian Optimization Allowing for Common Random Numbers

Authors: Michael Pearce, Matthias Poloczek, Juergen Branke

Abstract: Bayesian optimization is a powerful tool for expensive stochastic black-box optimization problems such as simulation-based optimization or machine learning hyperparameter tuning. Many stochastic objective functions implicitly require a random number seed as input. By explicitly reusing a seed a user can exploit common random numbers, comparing two or more inputs under the same randomly generated s… ▽ More Bayesian optimization is a powerful tool for expensive stochastic black-box optimization problems such as simulation-based optimization or machine learning hyperparameter tuning. Many stochastic objective functions implicitly require a random number seed as input. By explicitly reusing a seed a user can exploit common random numbers, comparing two or more inputs under the same randomly generated scenario, such as a common customer stream in a job shop problem, or the same random partition of training data into training and validation set for a machine learning algorithm. With the aim of finding an input with the best average performance over infinitely many seeds, we propose a novel Gaussian process model that jointly models both the output for each seed and the average. We then introduce the Knowledge gradient for Common Random Numbers that iteratively determines a combination of input and random seed to evaluate the objective and automatically trades off reusing old seeds and querying new seeds, thus overcoming the need to evaluate inputs in batches or measuring differences of pairs as suggested in previous methods. We investigate the Knowledge Gradient for Common Random Numbers both theoretically and empirically, finding it achieves significant performance improvements with only moderate added computational cost. △ Less

Submitted 21 October, 2019; originally announced October 2019.

Comments: 21 pages + Appendix, under review

arXiv:1910.01739 [pdf, other]

Scalable Global Optimization via Local Bayesian Optimization

Authors: David Eriksson, Michael Pearce, Jacob R Gardner, Ryan Turner, Matthias Poloczek

Abstract: Bayesian optimization has recently emerged as a popular method for the sample-efficient optimization of expensive black-box functions. However, the application to high-dimensional problems with several thousand observations remains challenging, and on difficult problems Bayesian optimization is often not competitive with other paradigms. In this paper we take the view that this is due to the impli… ▽ More Bayesian optimization has recently emerged as a popular method for the sample-efficient optimization of expensive black-box functions. However, the application to high-dimensional problems with several thousand observations remains challenging, and on difficult problems Bayesian optimization is often not competitive with other paradigms. In this paper we take the view that this is due to the implicit homogeneity of the global probabilistic models and an overemphasized exploration that results from global acquisition. This motivates the design of a local probabilistic approach for global optimization of large-scale high-dimensional problems. We propose the $\texttt{TuRBO}$ algorithm that fits a collection of local models and performs a principled global allocation of samples across these models via an implicit bandit approach. A comprehensive evaluation demonstrates that $\texttt{TuRBO}$ outperforms state-of-the-art methods from machine learning and operations research on problems spanning reinforcement learning, robotics, and the natural sciences. △ Less

Submitted 24 February, 2020; v1 submitted 3 October, 2019; originally announced October 2019.

Comments: Appears in NeurIPS 2019 as a spotlight paper

Journal ref: In Advances in Neural Information Processing Systems 32, pages 5497-5508. 2019

arXiv:1806.08838 [pdf, other]

Bayesian Optimization of Combinatorial Structures

Authors: Ricardo Baptista, Matthias Poloczek

Abstract: The optimization of expensive-to-evaluate black-box functions over combinatorial structures is an ubiquitous task in machine learning, engineering and the natural sciences. The combinatorial explosion of the search space and costly evaluations pose challenges for current techniques in discrete optimization and machine learning, and critically require new algorithmic ideas. This article proposes, t… ▽ More The optimization of expensive-to-evaluate black-box functions over combinatorial structures is an ubiquitous task in machine learning, engineering and the natural sciences. The combinatorial explosion of the search space and costly evaluations pose challenges for current techniques in discrete optimization and machine learning, and critically require new algorithmic ideas. This article proposes, to the best of our knowledge, the first algorithm to overcome these challenges, based on an adaptive, scalable model that identifies useful combinatorial structure even when data is scarce. Our acquisition function pioneers the use of semidefinite programming to achieve efficiency and scalability. Experimental evaluations demonstrate that this algorithm consistently outperforms other methods from combinatorial and Bayesian optimization. △ Less

Submitted 10 October, 2018; v1 submitted 22 June, 2018; originally announced June 2018.

Comments: Published at International Conference on Machine Learning 2018

arXiv:1705.07825 [pdf, other]

Comparing the Finite-Time Performance of Simulation-Optimization Algorithms

Authors: Naijia Dong, David J. Eckman, Matthias Poloczek, Xueqi Zhao, Shane G. Henderson

Abstract: We empirically evaluate the finite-time performance of several simulation-optimization algorithms on a testbed of problems with the goal of motivating further development of algorithms with strong finite-time performance. We investigate if the observed performance of the algorithms can be explained by properties of the problems, e.g., the number of decision variables, the topology of the objective… ▽ More We empirically evaluate the finite-time performance of several simulation-optimization algorithms on a testbed of problems with the goal of motivating further development of algorithms with strong finite-time performance. We investigate if the observed performance of the algorithms can be explained by properties of the problems, e.g., the number of decision variables, the topology of the objective function, or the magnitude of the simulation error. △ Less

Submitted 22 May, 2017; originally announced May 2017.

Comments: The source codes of benchmarks and algorithms are available at https://bitbucket.org/poloczek/finitetimesimopt

arXiv:1703.04389 [pdf, other]

Bayesian Optimization with Gradients

Authors: Jian Wu, Matthias Poloczek, Andrew Gordon Wilson, Peter I. Frazier

Abstract: Bayesian optimization has been successful at global optimization of expensive-to-evaluate multimodal objective functions. However, unlike most optimization methods, Bayesian optimization typically does not use derivative information. In this paper we show how Bayesian optimization can exploit derivative information to decrease the number of objective function evaluations required for good performa… ▽ More Bayesian optimization has been successful at global optimization of expensive-to-evaluate multimodal objective functions. However, unlike most optimization methods, Bayesian optimization typically does not use derivative information. In this paper we show how Bayesian optimization can exploit derivative information to decrease the number of objective function evaluations required for good performance. In particular, we develop a novel Bayesian optimization algorithm, the derivative-enabled knowledge-gradient (dKG), for which we show one-step Bayes-optimality, asymptotic consistency, and greater one-step value of information than is possible in the derivative-free setting. Our procedure accommodates noisy and incomplete derivative information, comes in both sequential and batch forms, and can optionally reduce the computational cost of inference through automatically selected retention of a single directional derivative. We also compute the d-KG acquisition function and its gradient using a novel fast discretization-free technique. We show d-KG provides state-of-the-art performance compared to a wide range of optimization procedures with and without gradients, on benchmarks including logistic regression, deep learning, kernel learning, and k-nearest neighbors. △ Less

Submitted 6 February, 2018; v1 submitted 13 March, 2017; originally announced March 2017.

Comments: Advances in Neural Information Processing Systems 30 (NIPS), 2017

Journal ref: Advances in Neural Information Processing Systems 30 (NIPS), 2017

arXiv:1608.03585 [pdf, other]

Warm Starting Bayesian Optimization

Authors: Matthias Poloczek, Jialei Wang, Peter I. Frazier

Abstract: We develop a framework for warm-starting Bayesian optimization, that reduces the solution time required to solve an optimization problem that is one in a sequence of related problems. This is useful when optimizing the output of a stochastic simulator that fails to provide derivative information, for which Bayesian optimization methods are well-suited. Solving sequences of related optimization pro… ▽ More We develop a framework for warm-starting Bayesian optimization, that reduces the solution time required to solve an optimization problem that is one in a sequence of related problems. This is useful when optimizing the output of a stochastic simulator that fails to provide derivative information, for which Bayesian optimization methods are well-suited. Solving sequences of related optimization problems arises when making several business decisions using one optimization model and input data collected over different time periods or markets. While many gradient-based methods can be warm started by initiating optimization at the solution to the previous problem, this warm start approach does not apply to Bayesian optimization methods, which carry a full metamodel of the objective function from iteration to iteration. Our approach builds a joint statistical model of the entire collection of related objective functions, and uses a value of information calculation to recommend points to evaluate. △ Less

Submitted 11 August, 2016; originally announced August 2016.

Comments: To Appear in the Proc. of the 2016 Winter Simulation Conference

arXiv:1603.00389 [pdf, other]

Multi-Information Source Optimization

Authors: Matthias Poloczek, Jialei Wang, Peter I. Frazier

Abstract: We consider Bayesian optimization of an expensive-to-evaluate black-box objective function, where we also have access to cheaper approximations of the objective. In general, such approximations arise in applications such as reinforcement learning, engineering, and the natural sciences, and are subject to an inherent, unknown bias. This model discrepancy is caused by an inadequate internal model th… ▽ More We consider Bayesian optimization of an expensive-to-evaluate black-box objective function, where we also have access to cheaper approximations of the objective. In general, such approximations arise in applications such as reinforcement learning, engineering, and the natural sciences, and are subject to an inherent, unknown bias. This model discrepancy is caused by an inadequate internal model that deviates from reality and can vary over the domain, making the utilization of these approximations a non-trivial task. We present a novel algorithm that provides a rigorous mathematical treatment of the uncertainties arising from model discrepancies and noisy observations. Its optimization decisions rely on a value of information analysis that extends the Knowledge Gradient factor to the setting of multiple information sources that vary in cost: each sampling decision maximizes the predicted benefit per unit cost. We conduct an experimental evaluation that demonstrates that the method consistently outperforms other state-of-the-art techniques: it finds designs of considerably higher objective value and additionally inflicts less cost in the exploration process. △ Less

Submitted 14 November, 2016; v1 submitted 1 March, 2016; originally announced March 2016.

Comments: Added: benchmark logistic regression on MNIST/USPS, comparison to MTBO/entropy search, estimation of hyper-parameters

Showing 1–10 of 10 results for author: Poloczek, M