Search | arXiv e-print repository

Optimal Experimental Design Criteria for Data-Consistent Inversion

Authors: Troy Butler, John Jakeman, Michael Pilosov, Scott Walsh, Timothy Wildey

Abstract: The ability to design effective experiments is crucial for obtaining data that can substantially reduce the uncertainty in the predictions made using computational models. An optimal experimental design (OED) refers to the choice of a particular experiment that optimizes a particular design criterion, e.g., maximizing a utility function, which measures the information content of the data. However, traditional approaches for optimal experimental design typically require solving a large number of computationally intensive inverse problems to find the data that maximizes the utility function. Here, we introduce two novel OED criteria that are specifically crafted for the data-consistent inversion (DCI) framework, but do not require solving inverse problems. DCI is a specific approach for solving a class of stochastic inverse problems by constructing a pullback measure on uncertain parameters from an observed probability measure on the outputs of a quantity of interest (QoI) map. While expected information gain (EIG) has been used for both DCI and Bayesian-based OED, the characteristics and properties of DCI solutions differ from those of solutions to Bayesian inverse problems, which should be reflected in the OED criteria. The new design criteria developed in this study, called the expected scaling effect and the expected skewness effect, leverage the geometric structure of pre-images associated with observable data sets, allowing for an intuitive and computationally efficient approach to OED. These criteria utilize singular value computations derived from sampled and approximated Jacobians of the experimental designs. We present both simultaneous and sequential (greedy) formulations of OED based on these innovative criteria. Numerical results demonstrate the effectiveness of our approach for solving stochastic inverse problems.

Submitted 13 June, 2025; originally announced June 2025.

arXiv:2505.09828 [pdf, ps, other]

Optimally balancing exploration and exploitation to automate multi-fidelity statistical estimation

Authors: Thomas Dixon, Alex Gorodetsky, John Jakeman, Akil Narayan, Yiming Xu

Abstract: Multi-fidelity methods that use an ensemble of models to compute a Monte Carlo estimator of the expectation of a high-fidelity model can significantly reduce computational costs compared to single-model approaches. These methods use oracle statistics, specifically the covariance between models, to optimally allocate samples to each model in the ensemble. However, in practice, the oracle statistics are estimated using additional model evaluations, whose computational cost and induced error are typically ignored. To address this issue, this paper proposes an adaptive algorithm to optimally balance the resources between oracle statistics estimation and final multi-fidelity estimator construction, leveraging ideas from multilevel best linear unbiased estimators in Schaden and Ullmann (2020) and a bandit-learning procedure in Xu et al. (2022). Under mild assumptions, we demonstrate that the multi-fidelity estimator produced by the proposed algorithm exhibits mean-squared error commensurate with that of the best linear unbiased estimator under the optimal allocation computed with oracle statistics. Our theoretical findings are supported by detailed numerical experiments, including a parametric elliptic PDE and an ice-sheet mass-change modeling problem.

Submitted 14 May, 2025; originally announced May 2025.

Comments: 37 pages

arXiv:2502.15496 [pdf, other]

Verification and Validation for Trustworthy Scientific Machine Learning

Authors: John D. Jakeman, Lorena A. Barba, Joaquim R. R. A. Martins, Thomas O'Leary-Roseberry

Abstract: Scientific machine learning (SciML) models are transforming many scientific disciplines. However, the development of good modeling practices to increase the trustworthiness of SciML has lagged behind its application, limiting its potential impact. The goal of this paper is to start a discussion on establishing consensus-based good practices for predictive SciML. We identify key challenges in applying existing computational science and engineering guidelines, such as verification and validation protocols, and provide recommendations to address these challenges. Our discussion focuses on predictive SciML, which uses machine learning models to learn, improve, and accelerate numerical simulations of physical systems. While centered on predictive applications, our 16 recommendations aim to help researchers conduct and document their modeling processes rigorously across all SciML domains.

Submitted 25 April, 2025; v1 submitted 21 February, 2025; originally announced February 2025.

Report number: SAND2025-01935O

MSC Class: 68T07; 68N30

ACM Class: I.6.4; I.6.5; G.4

arXiv:2412.06601 [pdf, other]

A switching Kalman filter approach to online mitigation and correction of sensor corruption for inertial navigation

Authors: Artem Mustaev, Nicholas Galioto, Matt Boler, John D. Jakeman, Cosmin Safta, Alex Gorodetsky

Abstract: This paper introduces a novel approach to detect and address faulty or corrupted external sensors in the context of inertial navigation by leveraging a switching Kalman Filter combined with parameter augmentation. Instead of discarding the corrupted data, the proposed method retains and processes it, running multiple observation models simultaneously and evaluating their likelihoods to accurately identify the true state of the system. We demonstrate the effectiveness of this approach to both identify the moment that a sensor becomes faulty and to correct for the resulting sensor behavior to maintain accurate estimates. We demonstrate our approach on an application of balloon navigation in the atmosphere and shuttle reentry. The results show that our method can accurately recover the true system state even in the presence of significant sensor bias, thereby improving the robustness and reliability of state estimation systems under challenging conditions. We also provide a statistical analysis of problem settings to determine when and where our method is most accurate and where it fails.

Submitted 10 December, 2024; v1 submitted 9 December, 2024; originally announced December 2024.

arXiv:2407.13814 [pdf, ps, other]

Building Population-Informed Priors for Bayesian Inference Using Data-Consistent Stochastic Inversion

Authors: Rebekah D. White, John D. Jakeman, Tim Wildey, Troy Butler

Abstract: Bayesian inference provides a powerful tool for leveraging observational data to inform model predictions and uncertainties. However, when such data is limited, Bayesian inference may not adequately constrain uncertainty without the use of highly informative priors. Common approaches for constructing informative priors typically rely on either assumptions or knowledge of the underlying physics, which may not be available in all scenarios. In this work, we consider the scenario where data are available on a population of assets/individuals, which occurs in many problem domains such as biomedical or digital twin applications, and leverage this population-level data to systematically constrain the Bayesian prior and subsequently improve individualized inferences. The approach proposed in this paper is based upon a recently developed technique known as data-consistent inversion (DCI) for constructing a pullback probability measure. Succinctly, we utilize DCI to build population-informed priors for subsequent Bayesian inference on individuals. While the approach is general and applies to nonlinear maps and arbitrary priors, we prove that for linear inverse problems with Gaussian priors, the population-informed prior produces an increase in the information gain as measured by the determinant and trace of the inverse posterior covariance. We also demonstrate that the Kullback-Leibler divergence often improves with high probability. Numerical results, including linear-Gaussian examples and one inspired by digital twins for additively manufactured assets, indicate that there is significant value in using these population-informed priors.

Submitted 24 June, 2025; v1 submitted 18 July, 2024; originally announced July 2024.

Comments: Corrected error in Algorithm 1. Small changes to illustrative examples and introductory text

arXiv:2407.00809 [pdf, other]

Kernel Neural Operators (KNOs) for Scalable, Memory-efficient, Geometrically-flexible Operator Learning

Authors: Matthew Lowery, John Turnage, Zachary Morrow, John D. Jakeman, Akil Narayan, Shandian Zhe, Varun Shankar

Abstract: This paper introduces the Kernel Neural Operator (KNO), a novel operator learning technique that uses deep kernel-based integral operators in conjunction with quadrature for function-space approximation of operators (maps from functions to functions). KNOs use parameterized, closed-form, finitely-smooth, and compactly-supported kernels with trainable sparsity parameters within the integral operators to significantly reduce the number of parameters that must be learned relative to existing neural operators. Moreover, the use of quadrature for numerical integration endows the KNO with geometric flexibility that enables operator learning on irregular geometries. Numerical results demonstrate that on existing benchmarks the training and test accuracy of KNOs is higher than popular operator learning techniques while using at least an order of magnitude fewer trainable parameters. KNOs thus represent a new paradigm of low-memory, geometrically-flexible, deep operator learning, while retaining the implementation simplicity and transparency of traditional kernel methods from both scientific computing and machine learning.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 10 pages + 5 page appendix, 8 figures

arXiv:2402.14736 [pdf, other]

Grouped approximate control variate estimators

Authors: Alex A. Gorodetsky, John D. Jakeman, Michael S. Eldred

Abstract: This paper analyzes the approximate control variate (ACV) approach to multifidelity uncertainty quantification in the case where weighted estimators are combined to form the components of the ACV. The weighted estimators enable one to precisely group models that share input samples to achieve improved variance reduction. We demonstrate that this viewpoint yields a generalized linear estimator that can assign any weight to any sample. This generalization shows that other linear estimators in the literature, particularly the multilevel best linear unbiased estimator (ML-BLUE) of Schaden and Ullmann (2020), become specific versions of the ACV estimator of Gorodetsky, Geraci, Jakeman, and Eldred (2020). Moreover, this connection enables numerous extensions and insights. For example, we empirically show that having non-independent groups can yield better variance reduction compared to the independent groups used by ML-BLUE. Furthermore, we show that such grouped estimators can use arbitrary weighted estimators, not just the simple Monte Carlo estimators used in ML-BLUE. Finally, the analysis enables the derivation of ML-BLUE directly from a variance reduction perspective, rather than a regression perspective.

Submitted 22 February, 2024; originally announced February 2024.

Comments: 17 pages, 3 figures
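
For readers unfamiliar with the control variate idea behind the entry above, the following is a minimal two-model sketch, not the grouped estimator the paper analyzes. The models `high_fidelity` and `low_fidelity`, the uniform input distribution, and the sample sizes are all hypothetical choices made for illustration. The weight is estimated from the shared samples themselves, which is a simplification that introduces a small bias.

```python
import numpy as np


def high_fidelity(z):
    """Hypothetical expensive model (illustrative only)."""
    return np.exp(z)


def low_fidelity(z):
    """Hypothetical cheap model: a quadratic Taylor approximation of exp."""
    return 1.0 + z + 0.5 * z**2


def mc_estimate(rng, n_high):
    """Plain Monte Carlo estimate of E[high_fidelity(Z)], Z ~ U(0, 1)."""
    return high_fidelity(rng.uniform(0.0, 1.0, n_high)).mean()


def acv_estimate(rng, n_high, n_low_extra):
    """Two-model approximate control variate estimate.

    The low-fidelity mean is unknown, so it is approximated with the
    shared samples plus n_low_extra additional cheap samples.
    """
    z_shared = rng.uniform(0.0, 1.0, n_high)
    z_extra = rng.uniform(0.0, 1.0, n_low_extra)
    f_hi = high_fidelity(z_shared)
    f_lo_shared = low_fidelity(z_shared)
    f_lo_all = low_fidelity(np.concatenate([z_shared, z_extra]))
    # Weight Cov(f, g) / Var(g), estimated from the shared samples
    # (an assumption made for brevity; a pilot study is more common).
    cov = np.cov(f_hi, f_lo_shared)
    alpha = cov[0, 1] / cov[1, 1]
    return f_hi.mean() - alpha * (f_lo_shared.mean() - f_lo_all.mean())


rng = np.random.default_rng(0)
mc_vals = np.array([mc_estimate(rng, 20) for _ in range(200)])
acv_vals = np.array([acv_estimate(rng, 20, 2000) for _ in range(200)])
print("MC variance: ", mc_vals.var())
print("ACV variance:", acv_vals.var())
```

Both estimators use the same number of high-fidelity evaluations per repetition, so the comparison isolates the effect of the extra low-fidelity samples.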

arXiv:2402.13768 [pdf, other]

Democratizing Uncertainty Quantification

Authors: Linus Seelinger, Anne Reinarz, Mikkel B. Lykkegaard, Robert Akers, Amal M. A. Alghamdi, David Aristoff, Wolfgang Bangerth, Jean Bénézech, Matteo Diez, Kurt Frey, John D. Jakeman, Jakob S. Jørgensen, Ki-Tae Kim, Benjamin M. Kent, Massimiliano Martinelli, Matthew Parno, Riccardo Pellegrini, Noemi Petra, Nicolai A. B. Riis, Katherine Rosenfeld, Andrea Serani, Lorenzo Tamellini, Umberto Villa, Tim J. Dodwell, Robert Scheichl

Abstract: Uncertainty Quantification (UQ) is vital to safety-critical model-based analyses, but the widespread adoption of sophisticated UQ methods is limited by technical complexity. In this paper, we introduce UM-Bridge (the UQ and Modeling Bridge), a high-level abstraction and software protocol that facilitates universal interoperability of UQ software with simulation codes. It breaks down the technical complexity of advanced UQ applications and enables separation of concerns between experts. UM-Bridge democratizes UQ by allowing effective interdisciplinary collaboration, accelerating the development of advanced UQ methods, and making it easy to perform UQ analyses from prototype to High Performance Computing (HPC) scale. In addition, we present a library of ready-to-run UQ benchmark problems, all easily accessible through UM-Bridge. These benchmarks support UQ methodology research, enabling reproducible performance comparisons. We demonstrate UM-Bridge with several scientific applications, harnessing HPC resources even using UQ codes not designed with HPC support.

Submitted 9 September, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: Add Benjamin Kent as co-author in accordance with the paper's published version

arXiv:2304.08644 [pdf, other]

doi 10.1016/j.cma.2023.116205

Multifidelity uncertainty quantification with models based on dissimilar parameters

Authors: Xiaoshu Zeng, Gianluca Geraci, Michael S. Eldred, John D. Jakeman, Alex A. Gorodetsky, Roger Ghanem

Abstract: Multifidelity uncertainty quantification (MF UQ) sampling approaches have been shown to significantly reduce the variance of statistical estimators while preserving the bias of the highest-fidelity model, provided that the low-fidelity models are well correlated. However, maintaining a high level of correlation can be challenging, especially when models depend on different input uncertain parameters, which drastically reduces the correlation. Existing MF UQ approaches do not adequately address this issue. In this work, we propose a new sampling strategy that exploits a shared space to improve the correlation among models with dissimilar parametrization. We achieve this by transforming the original coordinates onto an auxiliary manifold using the adaptive basis (AB) method (Tipireddy and Ghanem, 2014). The AB method has two main benefits: (1) it provides an effective tool to identify the low-dimensional manifold on which each model can be represented, and (2) it enables easy transformation of polynomial chaos representations from high- to low-dimensional spaces. This latter feature is used to identify a shared manifold among models without requiring additional evaluations. We present two algorithmic flavors of the new estimator to cover different analysis scenarios, including those with legacy and non-legacy high-fidelity data. We provide numerical results for analytical examples, a direct field acoustic test, and a finite element model of a nuclear fuel assembly. For all examples, we compare the proposed strategy against both single-fidelity and MF estimators based on the original model parametrization.

Submitted 17 April, 2023; originally announced April 2023.

arXiv:2212.12386 [pdf, ps, other]

Hyper-differential sensitivity analysis in the context of Bayesian inference applied to ice-sheet problems

Authors: William Reese, Joseph Hart, Bart van Bloemen Waanders, Mauro Perego, John Jakeman, Arvind Saibaba

Abstract: Inverse problems constrained by partial differential equations (PDEs) play a critical role in model development and calibration. In many applications, there are multiple uncertain parameters in a model which must be estimated. Although the Bayesian formulation is attractive for such problems, computational cost and high dimensionality frequently prohibit a thorough exploration of the parametric uncertainty. A common approach is to reduce the dimension by fixing some parameters (which we will call auxiliary parameters) to a best estimate and use techniques from PDE-constrained optimization to approximate properties of the Bayesian posterior distribution. For instance, the maximum a posteriori probability (MAP) and the Laplace approximation of the posterior covariance can be computed. In this article, we propose using hyper-differential sensitivity analysis (HDSA) to assess the sensitivity of the MAP point to changes in the auxiliary parameters. We establish an interpretation of HDSA as correlations in the posterior distribution. Our proposed framework is demonstrated on the inversion of bedrock topography for the Greenland ice sheet with uncertainties arising from the basal friction coefficient and climate forcing (ice accumulation rate).

Submitted 23 December, 2022; originally announced December 2022.

arXiv:2104.11079 [pdf, other]

doi 10.2172/1807223

Randomized Algorithms for Scientific Computing (RASC)

Authors: Aydin Buluc, Tamara G. Kolda, Stefan M. Wild, Mihai Anitescu, Anthony DeGennaro, John Jakeman, Chandrika Kamath, Ramakrishnan Kannan, Miles E. Lopes, Per-Gunnar Martinsson, Kary Myers, Jelani Nelson, Juan M. Restrepo, C. Seshadhri, Draguna Vrabie, Brendt Wohlberg, Stephen J. Wright, Chao Yang, Peter Zwart

Abstract: Randomized algorithms have propelled advances in artificial intelligence and represent a foundational research area in advancing AI for Science. Future advancements in DOE Office of Science priority areas such as climate science, astrophysics, fusion, advanced materials, combustion, and quantum computing all require randomized algorithms for surmounting challenges of complexity, robustness, and scalability. This report summarizes the outcomes of the workshop "Randomized Algorithms for Scientific Computing (RASC)," held virtually across four days in December 2020 and January 2021.

Submitted 21 March, 2022; v1 submitted 19 April, 2021; originally announced April 2021.

arXiv:2008.02672 [pdf, other]

MFNets: Data efficient all-at-once learning of multifidelity surrogates as directed networks of information sources

Authors: Alex Gorodetsky, John D. Jakeman, Gianluca Geraci

Abstract: We present an approach for constructing a surrogate from ensembles of information sources of varying cost and accuracy. The multifidelity surrogate encodes connections between information sources as a directed acyclic graph, and is trained via gradient-based minimization of a nonlinear least squares objective. While the vast majority of state-of-the-art approaches assume hierarchical connections between information sources, our approach works with flexibly structured information sources that may not admit a strict hierarchy. The formulation has two advantages: (1) increased data efficiency due to parsimonious multifidelity networks that can be tailored to the application; and (2) no constraints on the training data, so we can combine noisy, non-nested evaluations of the information sources. Numerical examples ranging from synthetic to physics-based computational mechanics simulations indicate the error in our approach can be orders-of-magnitude smaller, particularly in the low-data regime, than single-fidelity and hierarchical multifidelity approaches.

Submitted 23 August, 2021; v1 submitted 3 August, 2020; originally announced August 2020.

Comments: 24 pages

MSC Class: 62J02; 65D15; 41A10

arXiv:2006.09319 [pdf, other]

doi 10.1615/JMachLearnModelComput.2020035155

A Survey of Constrained Gaussian Process Regression: Approaches and Implementation Challenges

Authors: Laura Swiler, Mamikon Gulian, Ari Frankel, Cosmin Safta, John Jakeman

Abstract: Gaussian process regression is a popular Bayesian framework for surrogate modeling of expensive data sources. As part of a broader effort in scientific machine learning, many recent works have incorporated physical constraints or other a priori information within Gaussian process regression to supplement limited data and regularize the behavior of the model. We provide an overview and survey of several classes of Gaussian process constraints, including positivity or bound constraints, monotonicity and convexity constraints, differential equation constraints provided by linear PDEs, and boundary condition constraints. We compare the strategies behind each approach as well as the differences in implementation, concluding with a discussion of the computational challenges introduced by constraints.

Submitted 6 January, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

Comments: 42 pages, 3 figures. Version 3: DOI & Reference added; appeared in Journal of Machine Learning for Modeling and Computing. Version 2 includes minor additions, clarifications and improvements to notation

Journal ref: Journal of Machine Learning for Modeling and Computing, 1(2):119-156 (2020)

arXiv:2006.02392 [pdf, other]

Data-driven learning of non-autonomous systems

Authors: Tong Qin, Zhen Chen, John Jakeman, Dongbin Xiu

Abstract: We present a numerical framework for recovering unknown non-autonomous dynamical systems with time-dependent inputs. To circumvent the difficulty presented by the non-autonomous nature of the system, our method transforms the solution state into piecewise integration of the system over a discrete set of time instances. The time-dependent inputs are then locally parameterized by using a proper model, for example, polynomial regression, in the pieces determined by the time instances. This transforms the original system into a piecewise parametric system that is locally time invariant. We then design a deep neural network structure to learn the local models. Once the network model is constructed, it can be iteratively used over time to conduct global system prediction. We provide theoretical analysis of our algorithm and present a number of numerical examples to demonstrate the effectiveness of the method.

Submitted 2 June, 2020; originally announced June 2020.

arXiv:1910.07096 [pdf, other]

Deep learning of parameterized equations with applications to uncertainty quantification

Authors: Tong Qin, Zhen Chen, John Jakeman, Dongbin Xiu

Abstract: We propose a numerical method for discovering unknown parameterized dynamical systems by using observational data of the state variables. Our method is built upon and extends recent work on discovering unknown dynamical systems, in particular work using deep neural networks (DNNs). We propose a DNN structure, largely based upon the residual network (ResNet), to not only learn the unknown form of the governing equation but also take into account the random effect embedded in the system, which is generated by the random parameters. Once the DNN model is successfully constructed, it is able to produce system predictions over longer terms and for arbitrary parameter values. For uncertainty quantification, it allows us to conduct uncertainty analysis by evaluating solution statistics over the parameter space.

Submitted 10 March, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

arXiv:1909.13845 [pdf, other]

doi 10.1002/nme.6268

Adaptive Multi-index Collocation for Uncertainty Quantification and Sensitivity Analysis

Authors: John D. Jakeman, Michael Eldred, Gianluca Geraci, Alex Gorodetsky

Abstract: In this paper, we present an adaptive algorithm to construct response surface approximations of high-fidelity models using a hierarchy of lower fidelity models. Our algorithm is based on multi-index stochastic collocation and automatically balances physical discretization error and response surface error to construct an approximation of model outputs. This surrogate can be used for uncertainty quantification (UQ) and sensitivity analysis (SA) at a fraction of the cost of a purely high-fidelity approach. We demonstrate the effectiveness of our algorithm on a canonical test problem from the UQ literature and a complex multi-physics model that simulates the performance of an integrated nozzle for an unmanned aerospace vehicle. We find that, when the input-output response is sufficiently smooth, our algorithm produces approximations that can be over two orders of magnitude more accurate than single fidelity approximations for a fixed computational budget.

Submitted 30 September, 2019; originally announced September 2019.

Comments: 32 pages, 16 figures

MSC Class: 65C05; 65C20; 65C50; 65C60

arXiv:1903.09682 [pdf, other]

doi 10.1016/j.cma.2019.03.049

Polynomial chaos expansions for dependent random variables

Authors: John Jakeman, Fabian Franzelin, Akil Narayan, Michael Eldred, Dirk Pflueger

Abstract: Polynomial chaos expansions (PCE) are well-suited to quantifying uncertainty in models parameterized by independent random variables. The assumption of independence leads to simple strategies for evaluating PCE coefficients. In contrast, the application of PCE to models of dependent variables is much more challenging. Three approaches can be used. The first approach uses mapping methods where measure transformations, such as the Nataf and Rosenblatt transformations, can be used to map dependent random variables to independent ones; however, we show that this can significantly degrade performance since the Jacobian of the map must be approximated. A second strategy is the class of dominating support methods, which build PCE using independent random variables whose distributional support dominates the support of the true dependent joint density; we provide evidence that this approach appears to produce approximations with suboptimal accuracy. A third approach, the novel method proposed here, uses Gram-Schmidt orthogonalization (GSO) to numerically compute orthonormal polynomials for the dependent random variables. This approach has been used successfully when solving differential equations using the intrusive stochastic Galerkin method, and in this paper we use GSO to build PCE using a non-intrusive stochastic collocation method. The stochastic collocation method treats the model as a black box and builds approximations of model output from a set of samples. Building PCE from samples can introduce ill-conditioning which does not plague stochastic Galerkin methods. To mitigate this ill-conditioning we generate weighted Leja sequences, which are nested sample sets, to build accurate polynomial interpolants. We show that our proposed approach produces PCE which are orders of magnitude more accurate than PCE constructed using mapping or dominating support methods.

Submitted 22 March, 2019; originally announced March 2019.

Comments: 27 pages, 10 figures

MSC Class: 65C05; 65D99

arXiv:1811.04988 [pdf, other]

doi 10.1016/j.jcp.2020.109257

A Generalized Approximate Control Variate Framework for Multifidelity Uncertainty Quantification

Authors: Alex A. Gorodetsky, Gianluca Geraci, Mike Eldred, John D. Jakeman

Abstract: We describe and analyze a variance reduction approach for Monte Carlo (MC) sampling that accelerates the estimation of statistics of computationally expensive simulation models using an ensemble of models with lower cost. These lower cost models, which are typically lower fidelity with unknown statistics, are used to reduce the variance in statistical estimators relative to a MC estimator with equivalent cost. We derive the conditions under which our proposed approximate control variate framework recovers existing multi-model variance reduction schemes as special cases. We demonstrate that these existing strategies use recursive sampling strategies, and as a result, their maximum possible variance reduction is limited to that of a control variate algorithm that uses only a single low-fidelity model with known mean. This theoretical result holds regardless of the number of low-fidelity models and/or samples used to build the estimator. We then derive new sampling strategies within our framework that circumvent this limitation to make efficient use of all available information sources. In particular, we demonstrate that a significant gap can exist, of orders of magnitude in some cases, between the variance reduction achievable by using a single low-fidelity model and our non-recursive approach. We also present initial sample allocation approaches for exploiting this gap. They yield the greatest benefit when augmenting the high-fidelity model evaluations is impractical because, for instance, they arise from a legacy database. Several analytic examples and an example with a hyperbolic PDE describing elastic wave propagation in heterogeneous media are used to illustrate the main features of the methodology.

Submitted 6 June, 2019; v1 submitted 12 November, 2018; originally announced November 2018.

arXiv:1809.00434 [pdf, ps, other]

doi 10.1615/Int.J.UncertaintyQuantification.2018026902

Time and Frequency Domain Methods for Basis Selection in Random Linear Dynamical Systems

Authors: John D. Jakeman, Roland Pulch

Abstract: Polynomial chaos methods have been extensively used to analyze systems in uncertainty quantification. Furthermore, several approaches exist to determine a low-dimensional approximation (or sparse approximation) for some quantity of interest in a model, where just a few orthogonal basis polynomials are required. We consider linear dynamical systems consisting of ordinary differential equations with random variables. The aim of this paper is to explore methods for producing low-dimensional approximations of the quantity of interest further. We investigate two numerical techniques to compute a low-dimensional representation, which both fit the approximation to a set of samples in the time domain. On the one hand, a frequency domain analysis of a stochastic Galerkin system yields the selection of the basis polynomials. It follows a linear least squares problem. On the other hand, a sparse minimization yields the choice of the basis polynomials by information from the time domain only. An orthogonal matching pursuit produces an approximate solution of the minimization problem. We compare the two approaches using a test example from a mechanical application.

Submitted 2 September, 2018; originally announced September 2018.

MSC Class: 42C05; 41A10; 41A63

arXiv:1807.00375 [pdf, other]

doi 10.1137/18M1181675

Convergence of Probability Densities using Approximate Models for Forward and Inverse Problems in Uncertainty Quantification

Authors: T. Butler, J. D. Jakeman, T. Wildey

Abstract: We analyze the convergence of probability density functions utilizing approximate models for both forward and inverse problems. We consider the standard forward uncertainty quantification problem where an assumed probability density on parameters is propagated through the approximate model to produce a probability density, often called a push-forward probability density, on a set of quantities of interest (QoI). The inverse problem considered in this paper seeks a posterior probability density on model input parameters such that the subsequent push-forward density through the parameter-to-QoI map matches a given probability density on the QoI. We prove that the probability densities obtained from solving the forward and inverse problems, using approximate models, converge to the true probability densities as the approximate models converge to the true models. Numerical results are presented to demonstrate optimal convergence of probability densities for sparse grid approximations of parameter-to-QoI maps and standard spatial and temporal discretizations of PDEs and ODEs.

Submitted 1 July, 2018; originally announced July 2018.

MSC Class: 60H30; 60H35; 60B10

arXiv:1801.00885 [pdf, other]

doi 10.1016/j.jcp.2018.08.010

Gradient-based Optimization for Regression in the Functional Tensor-Train Format

Authors: Alex A. Gorodetsky, John D. Jakeman

Abstract: We consider the task of low-multilinear-rank functional regression, i.e., learning a low-rank parametric representation of functions from scattered real-valued data. Our first contribution is the development and analysis of an efficient gradient computation that enables gradient-based optimization procedures, including stochastic gradient descent and quasi-Newton methods, for learning the parameters of a functional tensor-train (FT). The functional tensor-train uses the tensor-train (TT) representation of low-rank arrays as an ansatz for a class of low-multilinear-rank functions. The FT is represented by a set of matrix-valued functions that contain a set of univariate functions, and the regression task is to learn the parameters of these univariate functions. Our second contribution demonstrates that using nonlinearly parameterized univariate functions, e.g., symmetric kernels with moving centers, within each core can outperform the standard approach of using a linear expansion of basis functions. Our final contributions are new rank adaptation and group-sparsity regularization procedures to minimize overfitting. We use several benchmark problems to demonstrate at least an order of magnitude lower error with gradient-based optimization methods than standard alternating least squares procedures in the low-sample number regime. We also demonstrate an order of magnitude reduction in error on a test problem resulting from using nonlinear parameterizations over linear parameterizations. Finally, we compare regression performance with 22 other nonparametric and parametric regression methods on 10 real-world data sets. We achieve top-five accuracy for seven of the data sets and best accuracy for two of the data sets. These rankings are the best amongst parametric models and competitive with the best non-parametric methods.

Submitted 10 January, 2018; v1 submitted 2 January, 2018; originally announced January 2018.

Comments: 24 pages

arXiv:1711.00506 [pdf, other]

doi 10.1016/j.cma.2018.04.009

Generation and application of multivariate polynomial quadrature rules

Authors: John D. Jakeman, Akil Narayan

Abstract: The search for multivariate quadrature rules of minimal size with a specified polynomial accuracy has been the topic of many years of research. Finding such a rule allows accurate integration of moments, which play a central role in many aspects of scientific computing with complex models. The contribution of this paper is twofold. First, we provide novel mathematical analysis of the polynomial quadrature problem that provides a lower bound for the minimal possible number of nodes in a polynomial rule with specified accuracy. We give concrete but simplistic multivariate examples where a minimal quadrature rule can be designed that achieves this lower bound, along with situations that showcase when it is not possible to achieve this lower bound. Our second main contribution comes in the formulation of an algorithm that is able to efficiently generate multivariate quadrature rules with positive weights on non-tensorial domains. Our tests show success of this procedure in up to 20 dimensions. We test our method on applications to dimension reduction and chemical kinetics problems, including comparisons against popular alternatives such as sparse grids, Monte Carlo and quasi Monte Carlo sequences, and Stroud rules. The quadrature rules computed in this paper outperform these alternatives in almost all scenarios.

Submitted 1 November, 2017; originally announced November 2017.

MSC Class: 42C05; 41A55; 41A63

arXiv:1705.09395 [pdf, other]

doi 10.1115/1.4037457

Optimal Experimental Design Using A Consistent Bayesian Approach

Authors: Scott N. Walsh, Tim M. Wildey, John D. Jakeman

Abstract: We consider the utilization of a computational model to guide the optimal acquisition of experimental data to inform the stochastic description of model input parameters. Our formulation is based on the recently developed consistent Bayesian approach for solving stochastic inverse problems which seeks a posterior probability density that is consistent with the model and the data in the sense that the push-forward of the posterior (through the computational model) matches the observed density on the observations almost everywhere. Given a set of potential observations, our optimal experimental design (OED) seeks the observation, or set of observations, that maximizes the expected information gain from the prior probability density on the model parameters. We discuss the characterization of the space of observed densities and a computationally efficient approach for rescaling observed densities to satisfy the fundamental assumptions of the consistent Bayesian approach. Numerical results are presented to compare our approach with existing OED methodologies using the classical/statistical Bayesian approach and to demonstrate our OED on a set of representative PDE-based models.

Submitted 25 May, 2017; originally announced May 2017.

MSC Class: 60H30; 60H35; 60B10

arXiv:1704.00680 [pdf, other]

doi 10.1137/16M1087229

A Consistent Bayesian Formulation for Stochastic Inverse Problems Based on Push-forward Measures

Authors: T. Butler, J. D. Jakeman, T. Wildey

Abstract: We formulate, and present a numerical method for solving, an inverse problem for inferring parameters of a deterministic model from stochastic observational data (quantities of interest). The solution, given as a probability measure, is derived using a Bayesian updating approach for measurable maps that finds a posterior probability measure that, when propagated through the deterministic model, produces a push-forward measure that exactly matches the observed probability measure on the data. Our approach for finding such posterior measures, which we call consistent Bayesian inference, is simple and only requires the computation of the push-forward probability measure induced by the combination of a prior probability measure and the deterministic model. We establish existence and uniqueness of observation-consistent posteriors and present stability and error analysis. We also discuss the relationships between consistent Bayesian inference, classical/statistical Bayesian inference, and a recently developed measure-theoretic approach for inference. Finally, analytical and numerical results are presented to highlight certain properties of the consistent Bayesian approach and the differences between this approach and the two aforementioned alternatives for inference.

Submitted 3 April, 2017; originally announced April 2017.

MSC Class: 60H30; 60H35; 60B10

arXiv:1703.00135 [pdf, other]

doi 10.1137/17M112590X

Compressed sensing with sparse corruptions: Fault-tolerant sparse collocation approximations

Authors: Ben Adcock, Anyi Bao, John D. Jakeman, Akil Narayan

Abstract: The recovery of approximately sparse or compressible coefficients in a Polynomial Chaos Expansion is a common goal in modern parametric uncertainty quantification (UQ). However, relatively little effort in UQ has been directed toward theoretical and computational strategies for addressing the sparse corruptions problem, where a small number of measurements are highly corrupted. Such a situation has become pertinent today since modern computational frameworks are sufficiently complex with many interdependent components that may introduce hardware and software failures, some of which can be difficult to detect and result in a highly polluted simulation result. In this paper we present a novel compressive sampling-based theoretical analysis for a regularized $\ell^1$ minimization algorithm that aims to recover sparse expansion coefficients in the presence of measurement corruptions. Our recovery results are uniform, and prescribe algorithmic regularization parameters in terms of a user-defined a priori estimate on the ratio of measurements that are believed to be corrupted. We also propose an iteratively reweighted optimization algorithm that automatically refines the value of the regularization parameter, and empirically produces superior results. Our numerical results test our framework on several medium-to-high dimensional examples of solutions to parameterized differential equations, and demonstrate the effectiveness of our approach.

Submitted 30 August, 2018; v1 submitted 28 February, 2017; originally announced March 2017.

Comments: 27 pages, 7 figures

arXiv:1602.06879 [pdf, other]

doi 10.1137/16M1063885

A generalized sampling and preconditioning scheme for sparse approximation of polynomial chaos expansions

Authors: John D. Jakeman, Akil Narayan, Tao Zhou

Abstract: In this paper we propose an algorithm for recovering sparse orthogonal polynomials using stochastic collocation. Our approach is motivated by the desire to use generalized polynomial chaos expansions (PCE) to quantify uncertainty in models subject to uncertain input parameters. The standard sampling approach for recovering sparse polynomials is to use Monte Carlo (MC) sampling of the density of orthogonality. However, MC methods result in poor function recovery when the polynomial degree is high. Here we propose a general algorithm that can be applied to any admissible weight function on a bounded domain and a wide class of exponential weight functions defined on unbounded domains. Our proposed algorithm samples with respect to the weighted equilibrium measure of the parametric domain, and subsequently solves a preconditioned $\ell^1$-minimization problem, where the weights of the diagonal preconditioning matrix are given by evaluations of the Christoffel function. We present theoretical analysis to motivate the algorithm, and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest. Numerical examples are also provided that demonstrate that our proposed Christoffel Sparse Approximation algorithm leads to comparable or improved accuracy even when compared with Legendre- and Hermite-specific algorithms.

Submitted 22 February, 2016; originally announced February 2016.

Comments: 32 pages, 10 figures

MSC Class: 42C05; 41A10; 41A63

arXiv:1412.4305 [pdf, other]

doi 10.1090/mcom/3192

A Christoffel function weighted least squares algorithm for collocation approximations

Authors: Akil Narayan, John D. Jakeman, Tao Zhou

Abstract: We propose, theoretically investigate, and numerically validate an algorithm for the Monte Carlo solution of least-squares polynomial approximation problems in a collocation framework. Our method is motivated by generalized Polynomial Chaos approximation in uncertainty quantification where a polynomial approximation is formed from a combination of orthogonal polynomials. A standard Monte Carlo approach would draw samples according to the density of orthogonality. Our proposed algorithm samples with respect to the equilibrium measure of the parametric domain, and subsequently solves a weighted least-squares problem, with weights given by evaluations of the Christoffel function. We present theoretical analysis to motivate the algorithm, and numerical results that show our method is superior to standard Monte Carlo methods in many situations of interest.

Submitted 29 January, 2016; v1 submitted 13 December, 2014; originally announced December 2014.

Comments: 29 pages, 11 figures

MSC Class: 65C05; 65D99

arXiv:1407.8093 [pdf, other]

doi 10.1016/j.jcp.2015.02.025

Enhancing $\ell_1$-minimization estimates of polynomial chaos expansions using basis selection

Authors: John D. Jakeman, Michael S. Eldred, Khachik Sargsyan

Abstract: In this paper we present a basis selection method that can be used with $\ell_1$-minimization to adaptively determine the large coefficients of polynomial chaos expansions (PCE). The adaptive construction produces anisotropic basis sets that have more terms in important dimensions and limits the number of unimportant terms that increase mutual coherence and thus degrade the performance of $\ell_1$-minimization. The important features and the accuracy of basis selection are demonstrated with a number of numerical examples. Specifically, we show that for a given computational budget, basis selection produces a more accurate PCE than would be obtained if the basis is fixed a priori. We also demonstrate that basis selection can be applied with non-uniform random variables and can leverage gradient information.

Submitted 30 July, 2014; originally announced July 2014.

arXiv:1407.1061 [pdf, other]

doi 10.1016/j.jcp.2014.09.014

Enhancing adaptive sparse grid approximations and improving refinement strategies using adjoint-based a posteriori error estimates

Authors: John D. Jakeman, Timothy Wildey

Abstract: In this paper we present an algorithm for adaptive sparse grid approximations of quantities of interest computed from discretized partial differential equations. We use adjoint-based a posteriori error estimates of the physical discretization error and the interpolation error in the sparse grid to enhance the sparse grid approximation and to drive adaptivity of the sparse grid. Utilizing these error estimates provides significantly more accurate functional values for random samples of the sparse grid approximation. We also demonstrate that alternative refinement strategies based upon a posteriori error estimates can lead to further increases in accuracy in the approximation over traditional hierarchical surplus based strategies. Throughout this paper we also provide and test a framework for balancing the physical discretization error with the stochastic interpolation error of the enhanced sparse grid approximation.

Submitted 3 July, 2014; originally announced July 2014.

arXiv:1404.5663 [pdf, other]

doi 10.1137/140966368

Adaptive Leja sparse grid constructions for stochastic collocation and high-dimensional approximation

Authors: Akil Narayan, John Jakeman

Abstract: We propose an adaptive sparse grid stochastic collocation approach based upon Leja interpolation sequences for approximation of parameterized functions with high-dimensional parameters. Leja sequences are arbitrarily granular (any number of nodes may be added to a current sequence, producing a new sequence) and thus are a good choice for the univariate composite rule used to construct adaptive sparse grids in high dimensions. When undertaking stochastic collocation one is often interested in constructing a weighted approximation where the weights are determined by the probability densities of the random variables. This paper establishes that a certain weighted formulation of one-dimensional Leja sequences produces a sequence of nodes whose empirical distribution converges to the corresponding limiting distribution of the Gauss quadrature nodes associated with the weight function. This property is true even for unbounded domains. We apply the Leja-sparse grid approach to several high-dimensional problems and demonstrate that Leja sequences are often superior to more standard sparse grid constructions (e.g. Clenshaw-Curtis), at least for interpolatory metrics.

Submitted 25 September, 2014; v1 submitted 22 April, 2014; originally announced April 2014.

Comments: 29 pages, 11 figures

MSC Class: 65D05; 65D32; 31C20

arXiv:1110.0010 [pdf, other]

Local and Dimension Adaptive Sparse Grid Interpolation and Quadrature

Authors: John D. Jakeman, Stephen G. Roberts

Abstract: In this paper we present a locally and dimension-adaptive sparse grid method for interpolation and integration of high-dimensional functions with discontinuities. The proposed algorithm combines the strengths of the generalised sparse grid algorithm and hierarchical surplus-guided local adaptivity. A high-degree basis is used to obtain a high-order method which, given sufficient smoothness, performs significantly better than the piecewise-linear basis. The underlying generalised sparse grid algorithm greedily selects the dimensions and variable interactions that contribute most to the variability of a function. The hierarchical surplus of points within the sparse grid is used as an error criterion for local refinement with the aim of concentrating computational effort within rapidly varying or discontinuous regions. This approach limits the number of points that are invested in 'unimportant' dimensions and regions within the high-dimensional domain. We show the utility of the proposed method for non-smooth functions with hundreds of variables.

Submitted 30 September, 2011; originally announced October 2011.

Showing 1–31 of 31 results for author: Jakeman, J