Conformal Approach To Gaussian Process Surrogate Evaluation With Coverage Guarantees
Authors:
Edgar Jaber,
Vincent Blot,
Nicolas Brunel,
Vincent Chabridon,
Emmanuel Remy,
Bertrand Iooss,
Didier Lucor,
Mathilde Mougeot,
Alessandro Leite
Abstract:
Gaussian processes (GPs) are a Bayesian machine learning approach widely used to construct surrogate models for the uncertainty quantification of computer simulation codes in industrial applications. A GP provides both a mean predictor and an estimate of the posterior prediction variance, the latter being used to produce Bayesian credibility intervals. Interpreting these intervals relies on the Gaussianity of the simulation model and on well-specified priors, assumptions that are not always appropriate. We propose to address this issue with conformal prediction. In the present work, adaptive cross-conformal prediction intervals are built by weighting the non-conformity score with the posterior standard deviation of the GP. The resulting conformal prediction intervals exhibit a level of adaptivity akin to Bayesian credibility sets and display a significant correlation with the local approximation error of the surrogate model, while being free from the underlying model assumptions and having frequentist coverage guarantees. These estimators can thus be used to evaluate the quality of a GP surrogate model and can assist a decision-maker in choosing the best prior for a specific application of the GP. The performance of the method is illustrated through a panel of numerical examples based on various reference databases. Moreover, the potential applicability of the method is demonstrated in the context of surrogate modeling of an expensive-to-evaluate simulator of the clogging phenomenon in steam generators of nuclear reactors.
Submitted 15 January, 2024;
originally announced January 2024.
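The core idea of the abstract above, a non-conformity score weighted by the GP posterior standard deviation, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it uses plain split conformal prediction instead of the cross-conformal scheme described in the paper, a hand-written zero-mean GP with fixed hyperparameters, and a toy one-dimensional function. All function names, hyperparameter values, and the test function are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    # Squared-exponential kernel between two 1-D input arrays (unit prior variance).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_new, noise=1e-2):
    # Posterior predictive mean and standard deviation of a zero-mean GP
    # with fixed (assumed) hyperparameters.
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_nt = rbf_kernel(x_new, x_train)
    mean = k_nt @ np.linalg.solve(k_tt, y_train)
    var = 1.0 + noise - np.sum(k_nt * np.linalg.solve(k_tt, k_nt.T).T, axis=1)
    return mean, np.sqrt(np.maximum(var, 1e-12))

def conformal_intervals(x_train, y_train, x_cal, y_cal, x_new, alpha=0.1):
    # Split conformal intervals with the score |y - mean| / std,
    # i.e. the residual weighted by the GP posterior standard deviation.
    mean_cal, std_cal = gp_posterior(x_train, y_train, x_cal)
    scores = np.sort(np.abs(y_cal - mean_cal) / std_cal)
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # If the calibration set is too small for the requested level, the interval is unbounded.
    q = np.inf if k > n else scores[k - 1]
    mean_new, std_new = gp_posterior(x_train, y_train, x_new)
    return mean_new - q * std_new, mean_new + q * std_new

# Toy data (hypothetical): noisy sine curve split into train / calibration / test.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 300)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(300)
x_train, y_train = x[:100], y[:100]
x_cal, y_cal = x[100:200], y[100:200]
x_test, y_test = x[200:], y[200:]

lower, upper = conformal_intervals(x_train, y_train, x_cal, y_cal, x_test, alpha=0.1)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"empirical coverage at nominal 90%: {coverage:.2f}")
```

Because the calibrated quantile multiplies the posterior standard deviation, the interval width varies with the input in the same way as a Bayesian credibility interval, while the coverage guarantee comes from the calibration step rather than from the Gaussian assumption.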
The ICSCREAM methodology: Identification of penalizing configurations in computer experiments using screening and metamodel -- Applications in thermal-hydraulics
Authors:
A. Marrel,
Bertrand Iooss,
V. Chabridon
Abstract:
In the framework of risk assessment in nuclear accident analysis, best-estimate computer codes, associated with a probabilistic modeling of the uncertain input variables, are used to estimate safety margins. A first step in such uncertainty quantification studies is often to identify the critical configurations (or penalizing, in the sense of a prescribed safety margin) of several input parameters (called "scenario inputs"), under the uncertainty on the other input parameters. However, the large CPU-time cost of most of the computer codes used in nuclear engineering, such as those related to thermal-hydraulic accident scenario simulations, requires the development of highly efficient strategies. This work focuses on machine learning algorithms by way of the metamodel-based approach (i.e., a mathematical model fitted on a small-size sample of simulations). To achieve this with a very large number of inputs, a specific and original methodology, called ICSCREAM (Identification of penalizing Configurations using SCREening And Metamodel), is proposed. The screening of influential inputs is based on an advanced global sensitivity analysis tool (HSIC importance measures). A Gaussian process metamodel is then sequentially built and used to estimate, within a Bayesian framework, the conditional probabilities of exceeding a high-level threshold, according to the scenario inputs. The efficiency of this methodology is illustrated on two high-dimensional (around a hundred inputs) thermal-hydraulic industrial cases simulating an accident of primary coolant loss in a pressurized water reactor. For both use cases, the study focuses on the peak cladding temperature (PCT), and critical configurations are defined by exceeding the 90%-quantile of PCT. In both cases, the ICSCREAM methodology makes it possible to estimate, using only around one thousand code simulations, the impact of the scenario inputs and their critical areas of values.
Submitted 27 August, 2021; v1 submitted 8 April, 2020;
originally announced April 2020.
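The last step of the abstract above, using a GP metamodel to estimate the probability of exceeding a high threshold, can be illustrated pointwise. The sketch below is not the ICSCREAM implementation: it only assumes that a fitted GP returns a posterior mean and standard deviation at each input point, and evaluates the Gaussian survival function at the threshold. The function name and the numerical values are hypothetical, and the averaging over the non-scenario inputs that the conditional probability in the paper would require is left out.

```python
import math
import numpy as np

def exceedance_probability(mean, std, threshold):
    # P(Y > threshold) for Y ~ N(mean, std^2), i.e. 1 - Phi(z),
    # written with the complementary error function.
    z = (threshold - np.asarray(mean, dtype=float)) / np.asarray(std, dtype=float)
    return 0.5 * np.vectorize(math.erfc)(z / math.sqrt(2.0))

# Hypothetical GP posterior means and standard deviations at three input points,
# and a hypothetical threshold standing in for the 90%-quantile of the output.
post_mean = np.array([600.0, 700.0, 800.0])
post_std = np.array([50.0, 50.0, 50.0])
threshold = 700.0

probs = exceedance_probability(post_mean, post_std, threshold)
print(probs)
```

In practice the threshold would be estimated from the simulation sample, for instance with `np.quantile(outputs, 0.9)`, where `outputs` is a hypothetical array of simulated values.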