-
Emulating complex dynamical simulators with random Fourier features
Authors:
Hossein Mohammadi,
Peter Challenor,
Marc Goodfellow
Abstract:
A Gaussian process (GP)-based methodology is proposed to emulate complex dynamical computer models (or simulators). The method relies on emulating the numerical flow map of the system over an initial (short) time step, where the flow map is a function that describes the evolution of the system from an initial condition to a subsequent value at the next time step. This yields a probability distribution over the entire flow map function, with each draw offering an approximation to the flow map. The model output time series is then predicted (under the Markov assumption) by drawing a sample from the emulated flow map (i.e., its posterior distribution) and using it to iterate from the initial condition ahead in time. Repeating this procedure with multiple such draws creates a distribution over the time series. The mean and variance of this distribution at a specific time point serve as the model output prediction and the associated uncertainty, respectively. However, drawing a GP posterior sample that represents the underlying function across its entire domain is computationally infeasible, given the infinite-dimensional nature of this object. To overcome this limitation, one can generate such a sample in an approximate manner using random Fourier features (RFF). RFF is an efficient technique for approximating the kernel and generating GP samples, offering both computational efficiency and theoretical guarantees. The proposed method is applied to emulate several dynamic nonlinear simulators including the well-known Lorenz and van der Pol models. The results suggest that our approach has promising predictive performance and that the associated uncertainty can capture the dynamics of the system appropriately.
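The procedure in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy ODE dx/dt = x(1 - x), the squared-exponential kernel, and the values of `DT`, `ELL`, `NOISE` and `M` are all assumptions chosen for illustration. The sketch approximates the kernel with random Fourier features, draws approximate posterior samples of the one-step flow map through Bayesian linear regression on those features, and iterates each draw from the initial condition to build a distribution over the time series.

```python
import numpy as np

rng = np.random.default_rng(0)
DT = 0.1        # short emulation time step (assumed value)
ELL = 0.5       # kernel lengthscale (assumed value)
NOISE = 1e-8    # small noise/nugget variance (assumed value)
M = 500         # number of random Fourier features (assumed value)

def flow_map(x, dt=DT):
    """Exact one-step flow map of the toy ODE dx/dt = x(1 - x)."""
    e = np.exp(dt)
    return x * e / (1.0 - x + x * e)

# RFF for the squared-exponential kernel: frequencies come from its
# spectral density N(0, 1/ELL^2), phases are uniform on [0, 2*pi).
omega = rng.normal(0.0, 1.0 / ELL, size=M)
phase = rng.uniform(0.0, 2.0 * np.pi, size=M)

def features(x):
    """Feature matrix with one row per input, so that Phi @ Phi.T ~ K."""
    x = np.atleast_1d(x)
    return np.sqrt(2.0 / M) * np.cos(np.outer(x, omega) + phase)

# Training data: states and their images under the one-step flow map.
x_train = np.linspace(0.0, 1.2, 15)
y_train = flow_map(x_train)

# Bayesian linear regression on the features with a N(0, I) weight prior.
# The weight posterior is N(A^{-1} Phi^T y, NOISE * A^{-1}).
Phi = features(x_train)
A = Phi.T @ Phi + NOISE * np.eye(M)
w_mean = np.linalg.solve(A, Phi.T @ y_train)
L = np.linalg.cholesky(A)

def draw_flow_map():
    """Return one approximate posterior sample of the flow map."""
    z = rng.standard_normal(M)
    w = w_mean + np.sqrt(NOISE) * np.linalg.solve(L.T, z)
    return lambda x: features(x) @ w

def simulate(f, x0, n_steps):
    """Iterate a one-step map f from x0 (Markov assumption)."""
    traj = [x0]
    for _ in range(n_steps):
        traj.append(f(traj[-1]).item())
    return np.array(traj)

x0, n_steps, n_draws = 0.1, 50, 20
trajs = np.array([simulate(draw_flow_map(), x0, n_steps) for _ in range(n_draws)])
mean_traj = trajs.mean(axis=0)   # prediction at each time point
var_traj = trajs.var(axis=0)     # associated uncertainty
truth = simulate(lambda x: np.atleast_1d(flow_map(x)), x0, n_steps)
```

Each call to `draw_flow_map` fixes one weight vector, so the same sampled function is reused at every time step of a trajectory, which is what makes the draw a single coherent approximation of the flow map rather than independent pointwise predictions.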
Submitted 25 November, 2024; v1 submitted 15 April, 2021;
originally announced April 2021.
-
Cross-validation based adaptive sampling for Gaussian process models
Authors:
Hossein Mohammadi,
Peter Challenor,
Daniel Williamson,
Marc Goodfellow
Abstract:
In many real-world applications, we are interested in approximating black-box, costly functions as accurately as possible with the smallest number of function evaluations. A complex computer code is an example of such a function. In this work, a Gaussian process (GP) emulator is used to approximate the output of a complex computer code. We consider the problem of extending an initial experiment (set of model runs) sequentially to improve the emulator. A sequential sampling approach based on leave-one-out (LOO) cross-validation is proposed that can be easily extended to a batch mode. This is a desirable property since it saves the user time when parallel computing is available. After fitting a GP to training data points, the expected squared LOO (ES-LOO) error is calculated at each design point. ES-LOO is used as a measure to identify important data points. More precisely, when this quantity is large at a point, the quality of prediction depends a great deal on that point, and adding more samples nearby could improve the accuracy of the GP. As a result, it is reasonable to select the next sample where ES-LOO is maximised. However, ES-LOO is only known at the experimental design and needs to be estimated at unobserved points. To do this, a second GP is fitted to the ES-LOO errors, and the next sample is chosen where the modified expected improvement (EI) criterion is maximised. EI is a popular acquisition function in Bayesian optimisation and is used to trade off local and global search. However, it has a tendency towards exploitation, meaning that its maximum is close to the (current) "best" sample. To avoid clustering, a modified version of EI, called pseudo expected improvement, is employed, which is more explorative than EI and allows us to discover unexplored regions. Our results show that the proposed sampling method is promising.
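The LOO step described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the toy `simulator`, the squared-exponential kernel with fixed hyperparameters, and the zero-mean GP are all hypothetical choices. The quantity `es_loo` is the unnormalised expected squared LOO error (LOO predictive variance plus squared LOO residual); the paper's exact ES-LOO definition may include a normalisation not shown here. The LOO moments use the standard closed form based on the inverse covariance matrix, so no refitting is needed, and one point is checked against a brute-force refit.

```python
import numpy as np

def sq_exp_kernel(a, b, ell=0.15):
    """Squared-exponential kernel for 1-D inputs (assumed lengthscale)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def simulator(x):
    """Hypothetical cheap stand-in for a costly computer code."""
    return np.sin(6.0 * x) + 0.5 * x

def loo_moments(K, y):
    """Closed-form LOO predictive mean and variance for a zero-mean GP
    with fixed hyperparameters, computed from K^{-1}."""
    K_inv = np.linalg.inv(K)
    diag = np.diag(K_inv)
    return y - (K_inv @ y) / diag, 1.0 / diag

x = np.linspace(0.0, 1.0, 8)            # initial design
y = simulator(x)
K = sq_exp_kernel(x, x) + 1e-5 * np.eye(len(x))

loo_mean, loo_var = loo_moments(K, y)
# Unnormalised expected squared LOO error at each design point.
es_loo = loo_var + (loo_mean - y) ** 2
most_influential = int(np.argmax(es_loo))

# Brute-force check at one point: refit without it and predict there.
i = 3
keep = np.arange(len(x)) != i
K_rest = K[np.ix_(keep, keep)]
k_i = K[keep, i]
brute_mean = k_i @ np.linalg.solve(K_rest, y[keep])
brute_var = K[i, i] - k_i @ np.linalg.solve(K_rest, k_i)
```

In the approach the abstract describes, these design-point values would then be modelled by a second GP and searched with the pseudo expected improvement criterion; that step is omitted here.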
Submitted 15 October, 2021; v1 submitted 4 May, 2020;
originally announced May 2020.
-
Emulating computer models with step-discontinuous outputs using Gaussian processes
Authors:
Hossein Mohammadi,
Peter Challenor,
Marc Goodfellow,
Daniel Williamson
Abstract:
In many real-world applications we are interested in approximating costly functions that are analytically unknown, e.g. complex computer codes. An emulator provides a fast approximation of such functions relying on a limited number of evaluations. Gaussian processes (GPs) are commonplace emulators due to their statistical properties such as the ability to estimate their own uncertainty. GPs are essentially developed to fit smooth, continuous functions. However, the assumptions of continuity and smoothness are unwarranted in many situations. For example, in computer models where bifurcations or tipping points occur, the outputs can be discontinuous. This work examines the capacity of GPs for emulating step-discontinuous functions. Several approaches are proposed for this purpose. Two special covariance functions/kernels are adapted with the ability to model discontinuities. They are the neural network and Gibbs kernels, whose properties are demonstrated using several examples. Another approach, which is called warping, is to transform the input space into a new space where a GP with a standard kernel, such as the Matérn family, is able to predict the function well. The transformation is performed by a parametric map whose parameters are estimated by maximum likelihood. The results show that the proposed approaches have superior performance to GPs with standard kernels in capturing sharp jumps in the true function.
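As a rough illustration of one of the kernels named above, the sketch below implements the standard neural network (arcsine) kernel for one-dimensional inputs and uses it for a GP posterior mean on a step function. The hyperparameter values `sigma0` and `sigma`, the nugget, and the step test function are assumptions for illustration only; the paper's adapted kernel and fitting procedure may differ, and no hyperparameter estimation is performed here.

```python
import numpy as np

def nn_kernel(a, b, sigma0=1.0, sigma=20.0):
    """Neural network (arcsine) kernel for 1-D inputs.

    Uses the augmented input (1, x) and Sigma = diag(sigma0^2, sigma^2).
    A large sigma lets sample paths change sharply near x = 0.
    """
    cross = sigma0**2 + sigma**2 * np.outer(a, b)
    norm_a = 1.0 + 2.0 * (sigma0**2 + sigma**2 * a**2)
    norm_b = 1.0 + 2.0 * (sigma0**2 + sigma**2 * b**2)
    arg = 2.0 * cross / np.sqrt(np.outer(norm_a, norm_b))
    # Clip only guards against rounding; the argument lies in (-1, 1).
    return (2.0 / np.pi) * np.arcsin(np.clip(arg, -1.0, 1.0))

def step(x):
    """Hypothetical step-discontinuous 'simulator' with a jump at 0."""
    return np.where(x < 0.0, 0.0, 1.0)

x_train = np.linspace(-1.0, 1.0, 20)
y_train = step(x_train)
x_test = np.linspace(-1.0, 1.0, 201)

nugget = 1e-6                                   # assumed jitter
K = nn_kernel(x_train, x_train)
alpha = np.linalg.solve(K + nugget * np.eye(len(x_train)), y_train)
post_mean = nn_kernel(x_test, x_train) @ alpha  # zero-mean GP posterior mean
```

Plotting `post_mean` against `step(x_test)`, and repeating the fit with a stationary kernel, is one way to compare how each handles the jump.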
Submitted 30 September, 2020; v1 submitted 5 March, 2019;
originally announced March 2019.
-
Emulating dynamic non-linear simulators using Gaussian processes
Authors:
Hossein Mohammadi,
Peter Challenor,
Marc Goodfellow
Abstract:
The dynamic emulation of non-linear deterministic computer codes where the output is a time series, possibly multivariate, is examined. Such computer models simulate the evolution of some real-world phenomenon over time, for example models of the climate or the functioning of the human brain. The models we are interested in are highly non-linear and exhibit tipping points, bifurcations and chaotic behaviour. However, each simulation run could be too time-consuming to perform analyses that require many runs, including quantifying the variation in model output with respect to changes in the inputs. Therefore, Gaussian process emulators are used to approximate the output of the code. To do this, the flow map of the system under study is emulated over a short time period. Then, it is used in an iterative way to predict the whole time series. A number of ways are proposed to take into account the uncertainty of inputs to the emulators, after fixed initial conditions, and the correlation between them through the time series. The methodology is illustrated with two examples: the highly non-linear dynamical systems described by the Lorenz and Van der Pol equations. In both cases, the predictive performance is relatively high and the measure of uncertainty provided by the method reflects the extent of predictability in each system.
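The core idea of emulating the flow map over a short time period and then iterating it can be sketched for the Van der Pol system as follows. This is a simplified illustration, not the authors' method: it uses only the GP posterior mean with a fixed squared-exponential kernel, so it omits the input-uncertainty propagation and the correlation handling that the abstract describes. The time step `DT`, the parameter `MU`, the lengthscale, the nugget, the grid design, and the RK4 integrator standing in for the simulator are all assumptions.

```python
import numpy as np

DT, MU = 0.1, 1.0   # assumed time step and Van der Pol parameter

def vdp_rhs(s):
    """Right-hand side of the Van der Pol equations for state(s) (x, y)."""
    x, y = s[..., 0], s[..., 1]
    return np.stack([y, MU * (1.0 - x**2) * y - x], axis=-1)

def true_flow_map(s, dt=DT, substeps=10):
    """Stand-in 'simulator': RK4 integration over one short time step."""
    h = dt / substeps
    s = np.asarray(s, dtype=float)
    for _ in range(substeps):
        k1 = vdp_rhs(s)
        k2 = vdp_rhs(s + 0.5 * h * k1)
        k3 = vdp_rhs(s + 0.5 * h * k2)
        k4 = vdp_rhs(s + h * k3)
        s = s + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return s

def kernel(A, B, ell=2.0):
    """Squared-exponential kernel on 2-D states (assumed lengthscale)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / ell**2)

# Training design: a grid of initial states and their one-step images.
g = np.linspace(-3.0, 3.0, 10)
X = np.array([[a, b] for a in g for b in g])
Y = true_flow_map(X)

# One zero-mean GP per output dimension, sharing the same kernel.
K = kernel(X, X) + 1e-8 * np.eye(len(X))
alpha = np.linalg.solve(K, Y)

def emulated_flow_map(s):
    """GP posterior mean of the flow map at a single state."""
    return (kernel(np.atleast_2d(s), X) @ alpha)[0]

def iterate(f, s0, n_steps):
    """Apply a one-step map repeatedly to predict the whole time series."""
    traj = [np.asarray(s0, dtype=float)]
    for _ in range(n_steps):
        traj.append(f(traj[-1]))
    return np.array(traj)

s0 = [1.0, 0.0]
emulated = iterate(emulated_flow_map, s0, 50)
reference = iterate(true_flow_map, s0, 50)
```

Because each step feeds the previous prediction back in as the next input, any one-step error compounds along the trajectory, which is why the uncertainty treatment discussed in the abstract matters for longer horizons.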
Submitted 18 February, 2019; v1 submitted 21 February, 2018;
originally announced February 2018.