
Multilevel Surrogate-based Control Variates

Authors: Mohamed Reda El Amri, Paul Mycek, Sophie Ricci, Matthias De Lozzo

Abstract: Monte Carlo (MC) sampling is a popular method for estimating the statistics (e.g. expectation and variance) of a random variable. Its slow convergence has led to the emergence of advanced techniques to reduce the variance of the MC estimator for the outputs of computationally expensive solvers. The control variates (CV) method corrects the MC estimator with a term derived from auxiliary random variables that are highly correlated with the original random variable. These auxiliary variables may come from surrogate models. Such a surrogate-based CV strategy is extended here to the multilevel Monte Carlo (MLMC) framework, which relies on a sequence of levels corresponding to numerical simulators with increasing accuracy and computational cost. MLMC combines output samples obtained across levels into a telescopic sum of differences between MC estimators for successive fidelities. In this paper, we introduce three multilevel variance reduction strategies that rely on surrogate-based CV and MLMC. MLCV is presented as an extension of CV where the correction terms devised from surrogate models for simulators of different levels add up. MLMC-CV improves the MLMC estimator by using a CV based on a surrogate of the correction term at each level. Further variance reduction is achieved by using the surrogate-based CVs of all the levels in the MLMC-MLCV strategy. Alternative solutions that reduce the subset of surrogates used for the multilevel estimation are also introduced.
The proposed methods are evaluated on a test case from the literature consisting of a spectral discretization of an uncertain 1D heat equation, where the statistic of interest is the expected value of the integrated temperature along the domain at a given time. The results are assessed in terms of the accuracy and computational cost of the multilevel estimators, depending on whether the construction of the surrogates, and the associated computational cost, precede the evaluation of the estimator. The results show that when the lower-fidelity outputs are strongly correlated with the high-fidelity outputs, a significant variance reduction is obtained when using surrogate models for the coarser levels only. Taking advantage of pre-existing surrogate models proves to be an even more efficient strategy.
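The surrogate-based control variate idea summarized in this abstract can be illustrated with a minimal single-level sketch. It is not the paper's MLCV, MLMC-CV or MLMC-MLCV estimator. The "simulator" exp(x), the linear "surrogate" 1 + x, the uniform input on [0, 1] and the function names are all assumptions chosen so that the surrogate mean (1.5) is known exactly.

```python
import math
import random
import statistics


def cv_estimate(f, g, g_mean, samples):
    """Control-variate estimate of E[f(X)] using a surrogate g with known mean."""
    fx = [f(x) for x in samples]
    gx = [g(x) for x in samples]
    f_bar = statistics.fmean(fx)
    g_bar = statistics.fmean(gx)
    # Sample estimate of the variance-minimizing coefficient Cov(f, g) / Var(g).
    cov_fg = sum((a - f_bar) * (b - g_bar) for a, b in zip(fx, gx)) / (len(fx) - 1)
    alpha = cov_fg / statistics.variance(gx)
    # Correct the plain MC mean with the known error of the surrogate mean.
    return f_bar - alpha * (g_bar - g_mean)


def simulator(x):
    # Hypothetical stand-in for an expensive solver.
    return math.exp(x)


def surrogate(x):
    # Hypothetical cheap surrogate: first-order Taylor expansion of exp at 0.
    # Its exact mean under U(0, 1) is 1.5.
    return 1.0 + x


def compare(n_samples=200, n_repeats=300, seed=0):
    """Return the empirical variances of the plain MC and CV estimators."""
    rng = random.Random(seed)
    mc, cv = [], []
    for _ in range(n_repeats):
        xs = [rng.random() for _ in range(n_samples)]
        mc.append(statistics.fmean(simulator(x) for x in xs))
        cv.append(cv_estimate(simulator, surrogate, 1.5, xs))
    return statistics.variance(mc), statistics.variance(cv)
```

Calling `compare()` returns the two empirical variances. Because exp(x) and 1 + x are strongly correlated on [0, 1], the CV variance should come out well below the MC variance, which is the effect the abstract attributes to highly correlated auxiliary variables.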

Submitted 19 June, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

arXiv:2303.01371 [pdf, other]

A scalable problem to benchmark robust multidisciplinary design optimization techniques

Authors: A Aziz-Alaoui, O Roustant, M de Lozzo

Abstract: A scalable problem to benchmark robust multidisciplinary design optimization (RMDO) algorithms is proposed. This allows the user to choose the number of disciplines, the dimensions of the coupling and design variables and the extent of the feasible domain. After a description of the mathematical background, a deterministic version of the scalable problem is defined and the conditions on the existence and uniqueness of the solution are given. Then, this deterministic scalable problem is made uncertain by adding random variables to the coupling equations. Under classical assumptions, the existence and uniqueness of the solution of this RMDO problem are guaranteed. This solution can be easily computed with a quadratic programming algorithm and serves as a reference to assess the performance of RMDO algorithms. This scalable problem has been implemented in the open source software GEMSEO and tested with two techniques of statistics estimation: Monte Carlo sampling and Taylor polynomials.

Submitted 27 February, 2023; originally announced March 2023.

arXiv:1705.01440 [pdf, other]

doi 10.1007/s00477-017-1470-4

Comparison of Polynomial Chaos and Gaussian Process surrogates for uncertainty quantification and correlation estimation of spatially distributed open-channel steady flows

Authors: Pamphile Tupui Roy, Nabil El Moçayd, Sophie Ricci, Jean-Christophe Jouhaud, Nicole Goutal, Matthias De Lozzo, Mélanie C. Rochoux

Abstract: Data assimilation is widely used to improve flood forecasting capability, especially through parameter inference requiring statistical information on the uncertain input parameters (upstream discharge, friction coefficient) as well as on the variability of the water level and its sensitivity with respect to the inputs. For particle filter or ensemble Kalman filter, stochastically estimating probability density functions and covariance matrices from a Monte Carlo random sampling requires a large ensemble of model evaluations, limiting their use in real-time applications. To tackle this issue, fast surrogate models based on Polynomial Chaos and Gaussian Process can be used to represent the spatially distributed water level in place of solving the shallow water equations. This study investigates the use of these surrogates to estimate probability density functions and covariance matrices at a reduced computational cost and without loss of accuracy, with a view to ensemble-based data assimilation. This study focuses on 1-D steady state flow simulated with MASCARET over the Garonne River (South-West France). Results show that both surrogates feature similar performance to the Monte Carlo random sampling, but for a much smaller computational budget; a few MASCARET simulations (on the order of 10-100) are sufficient to accurately retrieve covariance matrices and probability density functions all along the river, even where the flow dynamics are more complex due to heterogeneous bathymetry.
This paves the way for the design of surrogate strategies suitable for representing unsteady open-channel flows in data assimilation.

Submitted 17 October, 2017; v1 submitted 3 May, 2017; originally announced May 2017.

arXiv:1412.1414 [pdf, other]

New improvements in the use of dependence measures for sensitivity analysis and screening

Authors: Matthias De Lozzo, Amandine Marrel

Abstract: Physical phenomena are commonly modeled by numerical simulators. Such codes can take as input a large number of uncertain parameters and it is important to identify their influences via a global sensitivity analysis (GSA). However, these codes can be time-consuming, which prevents a GSA based on the classical Sobol' indices, as these require too many simulations. This is especially true when the number of inputs is large. To address this limitation, we consider recent advances in dependence measures, focusing on the distance correlation and the Hilbert-Schmidt independence criterion (HSIC). Our objective is to study these indices and use them for a screening purpose. Numerical tests reveal some differences between dependence measures and classical Sobol' indices, and preliminary answers to "What sensitivity indices to what situation?" are derived. Then, two approaches are proposed to use the dependence measures for a screening purpose. The first one directly uses these indices with independence tests; asymptotic tests and their spectral extensions exist and are detailed. For a higher accuracy in presence of small samples, we propose a non-asymptotic version based on bootstrap sampling. The second approach is based on a linear model associating two simulations, which explains their output difference as a weighted sum of their input differences. From this, a bootstrap method is proposed for the selection of the influential inputs. We also propose a heuristic approach for the calibration of the HSIC Lasso method.
Numerical experiments are performed and show the potential of these approaches for screening when many inputs are not influential.
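As a rough illustration of the HSIC dependence measure named in this abstract, the sketch below computes the standard biased empirical estimator trace(K H L H) / n² for scalar samples. The Gaussian kernels, the fixed bandwidth of 1.0 and the function names are assumptions for illustration. The abstract does not specify the kernels or estimators used in the paper, and the independence tests and bootstrap procedures it mentions are not reproduced here.

```python
import math
import random


def gram(values, bandwidth):
    """Gaussian-kernel Gram matrix of a list of scalar samples."""
    return [
        [math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in values]
        for a in values
    ]


def center(matrix):
    """Double-center a symmetric Gram matrix, i.e. compute H K H."""
    n = len(matrix)
    row_means = [sum(row) / n for row in matrix]
    total_mean = sum(row_means) / n
    # For a symmetric matrix the column means equal the row means.
    return [
        [matrix[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, bandwidth=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2 (bandwidth is an assumed default)."""
    n = len(xs)
    k_centered = center(gram(xs, bandwidth))
    l_gram = gram(ys, bandwidth)
    # trace(H K H L) equals the elementwise sum because L is symmetric.
    return sum(
        k_centered[i][j] * l_gram[i][j] for i in range(n) for j in range(n)
    ) / n ** 2
```

For a standard normal input `xs`, an output such as `[x * x for x in xs]` is uncorrelated with the input yet fully dependent on it. `hsic` should return a clearly larger value for that pair than for two independent samples, which is why such measures can serve for screening.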

Submitted 3 December, 2014; originally announced December 2014.

Showing 1–4 of 4 results for author: De Lozzo, M