-
General reproducing properties in RKHS with application to derivative and integral operators
Authors:
Fatima-Zahrae El-Boukkouri,
Josselin Garnier,
Olivier Roustant
Abstract:
In this paper, we consider the reproducing property in Reproducing Kernel Hilbert Spaces (RKHS). We establish a reproducing property for the closure of the class of combinations of composition operators under minimal conditions. This allows to revisit the sufficient conditions for the reproducing property to hold for the derivative operator, as well as for the existence of the mean embedding function. These results provide a framework of application of the representer theorem for regularized learning algorithms that involve data for function values, gradients, or any other operator from the considered class.
Submitted 31 March, 2025; v1 submitted 20 March, 2025;
originally announced March 2025.
-
Block-Additive Gaussian Processes under Monotonicity Constraints
Authors:
M. Deronzier,
A. F. López-Lopera,
F. Bachoc,
O. Roustant,
J. Rohmer
Abstract:
We generalize the additive constrained Gaussian process framework to handle interactions between input variables while enforcing monotonicity constraints everywhere on the input space. The block-additive structure of the model is particularly suitable in the presence of interactions, while maintaining tractable computations. In addition, we develop a sequential algorithm, MaxMod, for model selection (i.e., the choice of the active input variables and of the blocks). We speed up our implementations through efficient matrix computations and thanks to explicit expressions of criteria involved in MaxMod. The performance and scalability of our methodology are showcased with several numerical examples in dimensions up to 120, as well as in a 5D real-world coastal flooding application, where interpretability is enhanced by the selection of the blocks.
Submitted 21 January, 2025; v1 submitted 18 July, 2024;
originally announced July 2024.
-
High-dimensional additive Gaussian processes under monotonicity constraints
Authors:
Andrés F. López-Lopera,
François Bachoc,
Olivier Roustant
Abstract:
We introduce an additive Gaussian process framework accounting for monotonicity constraints and scalable to high dimensions. Our contributions are threefold. First, we show that our framework enables to satisfy the constraints everywhere in the input space. We also show that more general componentwise linear inequality constraints can be handled similarly, such as componentwise convexity. Second, we propose the additive MaxMod algorithm for sequential dimension reduction. By sequentially maximizing a squared-norm criterion, MaxMod identifies the active input dimensions and refines the most important ones. This criterion can be computed explicitly at a linear cost. Finally, we provide open-source codes for our full framework. We demonstrate the performance and scalability of the methodology in several synthetic examples with hundreds of dimensions under monotonicity constraints as well as on a real-world flood application.
Submitted 17 May, 2022;
originally announced May 2022.
-
A comparison of mixed-variables Bayesian optimization approaches
Authors:
Jhouben Cuesta-Ramirez,
Rodolphe Le Riche,
Olivier Roustant,
Guillaume Perrin,
Cedric Durantin,
Alain Gliere
Abstract:
Most real optimization problems are defined over a mixed search space where the variables are both discrete and continuous. In engineering applications, the objective function is typically calculated with a numerically costly black-box simulation. General mixed and costly optimization problems are therefore of great practical interest, yet their resolution remains in large part an open scientific question. In this article, costly mixed problems are approached through Gaussian processes where the discrete variables are relaxed into continuous latent variables. The continuous space is more easily harvested by classical Bayesian optimization techniques than a mixed space would be. Discrete variables are recovered either subsequently to the continuous optimization, or simultaneously with an additional continuous-discrete compatibility constraint that is handled with augmented Lagrangians. Several possible implementations of such Bayesian mixed optimizers are compared. In particular, the reformulation of the problem with continuous latent variables is put in competition with searches working directly in the mixed space. Among the algorithms involving latent variables and an augmented Lagrangian, particular attention is devoted to the Lagrange multipliers, for which a local and a global estimation technique are studied. The comparisons are based on the repeated optimization of three analytical functions and a beam design problem.
Submitted 3 May, 2022; v1 submitted 30 October, 2021;
originally announced November 2021.
-
Global sensitivity analysis using derivative-based sparse Poincaré chaos expansions
Authors:
Nora Lüthen,
Olivier Roustant,
Fabrice Gamboa,
Bertrand Iooss,
Stefano Marelli,
Bruno Sudret
Abstract:
Variance-based global sensitivity analysis, in particular Sobol' analysis, is widely used for determining the importance of input variables to a computational model. Sobol' indices can be computed cheaply based on spectral methods like polynomial chaos expansions (PCE). Another choice are the recently developed Poincaré chaos expansions (PoinCE), whose orthonormal tensor-product basis is generated from the eigenfunctions of one-dimensional Poincaré differential operators. In this paper, we show that the Poincaré basis is the unique orthonormal basis with the property that partial derivatives of the basis form again an orthogonal basis with respect to the same measure as the original basis. This special property makes PoinCE ideally suited for incorporating derivative information into the surrogate modelling process. Assuming that partial derivative evaluations of the computational model are available, we compute spectral expansions in terms of Poincaré basis functions or basis partial derivatives, respectively, by sparse regression. We show on two numerical examples that the derivative-based expansions provide accurate estimates for Sobol' indices, even outperforming PCE in terms of bias and variance. In addition, we derive an analytical expression based on the PoinCE coefficients for a second popular sensitivity index, the derivative-based sensitivity measure (DGSM), and explore its performance as upper bound to the corresponding total Sobol' indices.
Submitted 9 June, 2023; v1 submitted 1 July, 2021;
originally announced July 2021.
-
Functional principal component analysis for global sensitivity analysis of model with spatial output
Authors:
T. V. E. Perrin,
O. Roustant,
J. Rohmer,
O. Alata,
J. P. Naulin,
D. Idier,
R. Pedreros,
D. Moncoulon,
P. Tinard
Abstract:
Motivated by risk assessment of coastal flooding, we consider time-consuming simulators with a spatial output. The aim is to perform sensitivity analysis (SA), quantifying the influence of input parameters on the output. There are three main issues. First, due to computational time, standard SA techniques cannot be directly applied on the simulator. Second, the output is infinite dimensional, or at least high dimensional if the output is discretized. Third, the spatial output is non-stationary and exhibits strong local variations. We show that all these issues can be addressed all together by using functional PCA (FPCA). We first specify a functional basis, such as wavelets or B-splines, designed to handle local variations. Secondly, we select the most influential basis terms, either with an energy criterion after basis orthonormalization, or directly on the original basis with a penalized regression approach. Then FPCA further reduces dimension by doing PCA on the most influential basis coefficients, with an ad-hoc metric. Finally, fast-to-evaluate metamodels are built on the few selected principal components. They provide a proxy on which SA can be done. As a by-product, we obtain analytical formulas for variance-based sensitivity indices, generalizing known formula assuming orthonormality of basis functions.
Submitted 4 January, 2021; v1 submitted 20 May, 2020;
originally announced May 2020.
-
Approximating Gaussian Process Emulators with Linear Inequality Constraints and Noisy Observations via MC and MCMC
Authors:
Andrés F. López-Lopera,
François Bachoc,
Nicolas Durrande,
Jérémy Rohmer,
Déborah Idier,
Olivier Roustant
Abstract:
Adding inequality constraints (e.g. boundedness, monotonicity, convexity) into Gaussian processes (GPs) can lead to more realistic stochastic emulators. Due to the truncated Gaussianity of the posterior, its distribution has to be approximated. In this work, we consider Monte Carlo (MC) and Markov Chain Monte Carlo (MCMC) methods. However, strictly interpolating the observations may entail expensive computations due to highly restrictive sample spaces. Furthermore, having (constrained) GP emulators when data are actually noisy is also of interest for real-world implementations. Hence, we introduce a noise term for the relaxation of the interpolation conditions, and we develop the corresponding approximation of GP emulators under linear inequality constraints. We show with various toy examples that the performance of MC and MCMC samplers improves when considering noisy observations. Finally, on 2D and 5D coastal flooding applications, we show that more flexible and realistic GP implementations can be obtained by considering noise effects and by enforcing the (linear) inequality constraints.
Submitted 21 June, 2019; v1 submitted 15 January, 2019;
originally announced January 2019.
-
Group kernels for Gaussian process metamodels with categorical inputs
Authors:
Olivier Roustant,
Esperan Padonou,
Yves Deville,
Aloïs Clément,
Guillaume Perrin,
Jean Giorla,
Henry Wynn
Abstract:
Gaussian processes (GP) are widely used as a metamodel for emulating time-consuming computer codes. We focus on problems involving categorical inputs, with a potentially large number L of levels (typically several tens), partitioned in G << L groups of various sizes. Parsimonious covariance functions, or kernels, can then be defined by block covariance matrices T with constant covariances between pairs of blocks and within blocks. We study the positive definiteness of such matrices to encourage their practical use. The hierarchical group/level structure, equivalent to a nested Bayesian linear model, provides a parameterization of valid block matrices T. The same model can then be used when the assumption within blocks is relaxed, giving a flexible parametric family of valid covariance matrices with constant covariances between pairs of blocks. The positive definiteness of T is equivalent to the positive definiteness of a smaller matrix of size G, obtained by averaging each block. The model is applied to a problem in nuclear waste analysis, where one of the categorical inputs is atomic number, which has more than 90 levels.
Submitted 24 July, 2018; v1 submitted 7 February, 2018;
originally announced February 2018.
-
Finite-dimensional Gaussian approximation with linear inequality constraints
Authors:
Andrés F. López-Lopera,
François Bachoc,
Nicolas Durrande,
Olivier Roustant
Abstract:
Introducing inequality constraints in Gaussian process (GP) models can lead to more realistic uncertainties in learning a great variety of real-world problems. We consider the finite-dimensional Gaussian approach from Maatouk and Bay (2017) which can satisfy inequality conditions everywhere (either boundedness, monotonicity or convexity). Our contributions are threefold. First, we extend their approach in order to deal with general sets of linear inequalities. Second, we explore several Markov Chain Monte Carlo (MCMC) techniques to approximate the posterior distribution. Third, we investigate theoretical and numerical properties of the constrained likelihood for covariance parameter estimation. According to experiments on both artificial and real data, our full framework together with a Hamiltonian Monte Carlo-based sampler provides efficient results on both data fitting and uncertainty quantification.
Submitted 20 October, 2017;
originally announced October 2017.
-
On the choice of the low-dimensional domain for global optimization via random embeddings
Authors:
Mickaël Binois,
David Ginsbourger,
Olivier Roustant
Abstract:
The challenge of taking many variables into account in optimization problems may be overcome under the hypothesis of low effective dimensionality. Then, the search of solutions can be reduced to the random embedding of a low dimensional space into the original one, resulting in a more manageable optimization problem. Specifically, in the case of time consuming black-box functions and when the budget of evaluations is severely limited, global optimization with random embeddings appears as a sound alternative to random search. Yet, in the case of box constraints on the native variables, defining suitable bounds on a low dimensional domain appears to be complex. Indeed, a small search domain does not guarantee to find a solution even under restrictive hypotheses about the function, while a larger one may slow down convergence dramatically. Here we tackle the issue of low-dimensional domain selection based on a detailed study of the properties of the random embedding, giving insight on the aforementioned difficulties. In particular, we describe a minimal low-dimensional set in correspondence with the embedded search space. We additionally show that an alternative equivalent embedding procedure yields simultaneously a simpler definition of the low-dimensional minimal set and better properties in practice. Finally, the performance and robustness gains of the proposed enhancements for Bayesian optimization are illustrated on numerical examples.
Submitted 22 October, 2018; v1 submitted 18 April, 2017;
originally announced April 2017.
-
Poincaré inequalities on intervals -- application to sensitivity analysis
Authors:
Olivier Roustant,
Franck Barthe,
Bertrand Iooss
Abstract:
The development of global sensitivity analysis of numerical model outputs has recently raised new issues on 1-dimensional Poincaré inequalities. Typically, two kinds of sensitivity indices are linked by a Poincaré-type inequality, which provides upper bounds on the most interpretable index by using the other one, which is cheaper to compute. This allows performing a low-cost screening of unessential variables. The efficiency of this screening then highly depends on the accuracy of the upper bounds in Poincaré inequalities. The novelty in the questions concerns the wide range of probability distributions involved, which are often truncated on intervals. After providing an overview of the existing knowledge and techniques, we add some theory about Poincaré constants on intervals, with improvements for symmetric intervals. Then we exploit the spectral interpretation for computing the exact value of Poincaré constants of any admissible distribution on a given interval. We give semi-analytical results for some frequent distributions (truncated exponential, triangular, truncated normal), and present a numerical method in the general case. Finally, an application is made to a hydrological problem, showing the benefits of the new results in Poincaré inequalities to sensitivity analysis.
Submitted 12 December, 2016;
originally announced December 2016.
-
Universal Prediction Distribution for Surrogate Models
Authors:
Malek Ben Salem,
Olivier Roustant,
Fabrice Gamboa,
Lionel Tomaso
Abstract:
The use of surrogate models instead of computationally expensive simulation codes is very convenient in engineering. Roughly speaking, there are two kinds of surrogate models: the deterministic and the probabilistic ones. The latter are generally based on Gaussian assumptions. The main advantage of the probabilistic approach is that it provides a measure of uncertainty associated with the surrogate model in the whole space. This uncertainty is an efficient tool to construct strategies for various problems such as prediction enhancement, optimization or inversion. In this paper, we propose a universal method to define a measure of uncertainty suitable for any surrogate model, either deterministic or probabilistic. It relies on Cross-Validation (CV) sub-model predictions. This empirical distribution may be computed in much more general frames than the Gaussian one, so it is called the Universal Prediction distribution (UP distribution). It allows the definition of many sampling criteria. We give and study adaptive sampling techniques for global refinement and an extension of the so-called Efficient Global Optimization (EGO) algorithm. We also discuss the use of the UP distribution for inversion problems. The performances of these new algorithms are studied both on toy models and on an engineering design problem.
Submitted 23 December, 2015;
originally announced December 2015.
-
A warped kernel improving robustness in Bayesian optimization via random embeddings
Authors:
Mickaël Binois,
David Ginsbourger,
Olivier Roustant
Abstract:
This work extends the Random Embedding Bayesian Optimization approach by integrating a warping of the high dimensional subspace within the covariance kernel. The proposed warping, which relies on elementary geometric considerations, mitigates the drawbacks of the high extrinsic dimensionality while preventing the algorithm from evaluating points giving redundant information. It also alleviates constraints on bound selection for the embedded domain, thus improving the robustness, as illustrated with a test case with 25 variables and intrinsic dimension 6.
Submitted 18 March, 2015; v1 submitted 13 November, 2014;
originally announced November 2014.
-
Computer experiments with functional inputs and scalar outputs by a norm-based approach
Authors:
Thomas Muehlenstaedt,
Jana Fruth,
Olivier Roustant
Abstract:
A framework for designing and analyzing computer experiments is presented, which is constructed for dealing with functional and real number inputs and real number outputs. For designing experiments with both functional and real number inputs a two stage approach is suggested. The first stage consists of constructing a candidate set for each functional input and during the second stage an optimal combination of the found candidate sets and a Latin hypercube for the real number inputs is searched for. The resulting designs can be considered to be generalizations of Latin hypercubes. GP models are explored as metamodel. The functional inputs are incorporated into the kriging model by applying norms in order to define distances between two functional inputs. In order to make the calculation of these norms computationally feasible, the use of B-splines is promoted.
Submitted 1 October, 2014;
originally announced October 2014.
-
Invariances of random fields paths, with applications in Gaussian Process Regression
Authors:
David Ginsbourger,
Olivier Roustant,
Nicolas Durrande
Abstract:
We study pathwise invariances of centred random fields that can be controlled through the covariance. A result involving composition operators is obtained in second-order settings, and we show that various path properties including additivity boil down to invariances of the covariance kernel. These results are extended to a broader class of operators in the Gaussian case, via the Loève isometry. Several covariance-driven pathwise invariances are illustrated, including fields with symmetric paths, centred paths, harmonic paths, or sparse paths. The proposed approach delivers a number of promising results and perspectives in Gaussian process regression.
Submitted 6 August, 2013;
originally announced August 2013.
-
Additive Covariance Kernels for High-Dimensional Gaussian Process Modeling
Authors:
Nicolas Durrande,
David Ginsbourger,
Olivier Roustant,
Laurent Carraro
Abstract:
Gaussian process models, also called Kriging models, are often used as mathematical approximations of expensive experiments. However, the number of observations required for building an emulator becomes unrealistic with classical covariance kernels as the dimension of the input increases. In order to get round the curse of dimensionality, a popular approach is to consider simplified models such as additive models. The ambition of the present work is to give an insight into covariance kernels that are well suited for building additive Kriging models and to describe some properties of the resulting models.
Submitted 27 November, 2011;
originally announced November 2011.
-
ANOVA kernels and RKHS of zero mean functions for model-based sensitivity analysis
Authors:
Nicolas Durrande,
David Ginsbourger,
Olivier Roustant,
Laurent Carraro
Abstract:
Given a reproducing kernel Hilbert space H of real-valued functions and a suitable measure mu over the source space D (subset of R), we decompose H as the sum of a subspace of centered functions for mu and its orthogonal in H. This decomposition leads to a special case of ANOVA kernels, for which the functional ANOVA representation of the best predictor can be elegantly derived, either in an interpolation or regularization framework. The proposed kernels appear to be particularly convenient for analyzing the effect of each (group of) variable(s) and computing sensitivity indices without recursivity.
Submitted 7 December, 2012; v1 submitted 17 June, 2011;
originally announced June 2011.
-
Additive Kernels for Gaussian Process Modeling
Authors:
Nicolas Durrande,
David Ginsbourger,
Olivier Roustant
Abstract:
Gaussian Process (GP) models are often used as mathematical approximations of computationally expensive experiments. Provided that its kernel is suitably chosen and that enough data is available to obtain a reasonable fit of the simulator, a GP model can beneficially be used for tasks such as prediction, optimization, or Monte-Carlo-based quantification of uncertainty. However, the former conditions become unrealistic when using classical GPs as the dimension of input increases. One popular alternative is then to turn to Generalized Additive Models (GAMs), relying on the assumption that the simulator's response can approximately be decomposed as a sum of univariate functions. If such an approach has been successfully applied in approximation, it is nevertheless not completely compatible with the GP framework and its versatile applications. The ambition of the present work is to give an insight into the use of GPs for additive models by integrating additivity within the kernel, and proposing a parsimonious numerical method for data-driven parameter estimation. The first part of this article deals with the kernels naturally associated to additive processes and the properties of the GP models based on such kernels. The second part is dedicated to a numerical procedure based on relaxation for additive kernel parameter estimation. Finally, the efficiency of the proposed method is illustrated and compared to other approaches on Sobol's g-function.
Submitted 21 March, 2011;
originally announced March 2011.
-
Calculations of Sobol indices for the Gaussian process metamodel
Authors:
Amandine Marrel,
Bertrand Iooss,
Beatrice Laurent,
Olivier Roustant
Abstract:
Global sensitivity analysis of complex numerical models can be performed by calculating variance-based importance measures of the input variables, such as the Sobol indices. However, these techniques, requiring a large number of model evaluations, are often unacceptable for time-expensive computer codes. A well-known and widely used solution consists in replacing the computer code by a metamodel, which predicts the model responses with a negligible computation time and makes the estimation of Sobol indices straightforward. In this paper, we discuss the Gaussian process model, which gives analytical expressions of Sobol indices. Two approaches are studied to compute the Sobol indices: the first based on the predictor of the Gaussian process model and the second based on the global stochastic process model. Comparisons between the two estimates, made on analytical examples, show the superiority of the second approach in terms of convergence and robustness. Moreover, the second approach makes it possible to integrate the modeling error of the Gaussian process model by directly giving some confidence intervals on the Sobol indices. These techniques are finally applied to a real case of hydrogeological modeling.
Submitted 7 February, 2008;
originally announced February 2008.