Verification and Validation for Trustworthy Scientific Machine Learning

Authors: John D. Jakeman, Lorena A. Barba, Joaquim R. R. A. Martins, Thomas O'Leary-Roseberry

Abstract: Scientific machine learning (SciML) models are transforming many scientific disciplines. However, the development of good modeling practices to increase the trustworthiness of SciML has lagged behind its application, limiting its potential impact. The goal of this paper is to start a discussion on establishing consensus-based good practices for predictive SciML. We identify key challenges in applying existing computational science and engineering guidelines, such as verification and validation protocols, and provide recommendations to address these challenges. Our discussion focuses on predictive SciML, which uses machine learning models to learn, improve, and accelerate numerical simulations of physical systems. While centered on predictive applications, our 16 recommendations aim to help researchers conduct and document their modeling processes rigorously across all SciML domains.

Submitted 25 April, 2025; v1 submitted 21 February, 2025; originally announced February 2025.

Report number: SAND2025-01935O

MSC Class: 68T07; 68N30

ACM Class: I.6.4; I.6.5; G.4

arXiv:2412.06601 [pdf, other]

A switching Kalman filter approach to online mitigation and correction of sensor corruption for inertial navigation

Authors: Artem Mustaev, Nicholas Galioto, Matt Boler, John D. Jakeman, Cosmin Safta, Alex Gorodetsky

Abstract: This paper introduces a novel approach to detect and address faulty or corrupted external sensors in the context of inertial navigation by leveraging a switching Kalman Filter combined with parameter augmentation. Instead of discarding the corrupted data, the proposed method retains and processes it, running multiple observation models simultaneously and evaluating their likelihoods to accurately identify the true state of the system. We demonstrate the effectiveness of this approach to both identify the moment that a sensor becomes faulty and to correct for the resulting sensor behavior to maintain accurate estimates. We demonstrate our approach on an application of balloon navigation in the atmosphere and shuttle reentry. The results show that our method can accurately recover the true system state even in the presence of significant sensor bias, thereby improving the robustness and reliability of state estimation systems under challenging conditions. We also provide a statistical analysis of problem settings to determine when and where our method is most accurate and where it fails.

Submitted 10 December, 2024; v1 submitted 9 December, 2024; originally announced December 2024.

arXiv:2407.00809 [pdf, other]

Kernel Neural Operators (KNOs) for Scalable, Memory-efficient, Geometrically-flexible Operator Learning

Authors: Matthew Lowery, John Turnage, Zachary Morrow, John D. Jakeman, Akil Narayan, Shandian Zhe, Varun Shankar

Abstract: This paper introduces the Kernel Neural Operator (KNO), a novel operator learning technique that uses deep kernel-based integral operators in conjunction with quadrature for function-space approximation of operators (maps from functions to functions). KNOs use parameterized, closed-form, finitely-smooth, and compactly-supported kernels with trainable sparsity parameters within the integral operators to significantly reduce the number of parameters that must be learned relative to existing neural operators. Moreover, the use of quadrature for numerical integration endows the KNO with geometric flexibility that enables operator learning on irregular geometries. Numerical results demonstrate that on existing benchmarks the training and test accuracy of KNOs is higher than popular operator learning techniques while using at least an order of magnitude fewer trainable parameters. KNOs thus represent a new paradigm of low-memory, geometrically-flexible, deep operator learning, while retaining the implementation simplicity and transparency of traditional kernel methods from both scientific computing and machine learning.

Submitted 30 June, 2024; originally announced July 2024.

Comments: 10 pages + 5 page appendix, 8 figures

arXiv:2402.14736 [pdf, other]

Grouped approximate control variate estimators

Authors: Alex A. Gorodetsky, John D. Jakeman, Michael S. Eldred

Abstract: This paper analyzes the approximate control variate (ACV) approach to multifidelity uncertainty quantification in the case where weighted estimators are combined to form the components of the ACV. The weighted estimators enable one to precisely group models that share input samples to achieve improved variance reduction. We demonstrate that this viewpoint yields a generalized linear estimator that can assign any weight to any sample. This generalization shows that other linear estimators in the literature, particularly the multilevel best linear unbiased estimator (ML-BLUE) of Schaden and Ullman in 2020, becomes a specific version of the ACV estimator of Gorodetsky, Geraci, Jakeman, and Eldred, 2020. Moreover, this connection enables numerous extensions and insights. For example, we empirically show that having non-independent groups can yield better variance reduction compared to the independent groups used by ML-BLUE. Furthermore, we show that such grouped estimators can use arbitrary weighted estimators, not just the simple Monte Carlo estimators used in ML-BLUE. Furthermore, the analysis enables the derivation of ML-BLUE directly from a variance reduction perspective, rather than a regression perspective.

Submitted 22 February, 2024; originally announced February 2024.

Comments: 17 pages, 3 figures

arXiv:2402.13768 [pdf, other]

Democratizing Uncertainty Quantification

Authors: Linus Seelinger, Anne Reinarz, Mikkel B. Lykkegaard, Robert Akers, Amal M. A. Alghamdi, David Aristoff, Wolfgang Bangerth, Jean Bénézech, Matteo Diez, Kurt Frey, John D. Jakeman, Jakob S. Jørgensen, Ki-Tae Kim, Benjamin M. Kent, Massimiliano Martinelli, Matthew Parno, Riccardo Pellegrini, Noemi Petra, Nicolai A. B. Riis, Katherine Rosenfeld, Andrea Serani, Lorenzo Tamellini, Umberto Villa, Tim J. Dodwell, Robert Scheichl

Abstract: Uncertainty Quantification (UQ) is vital to safety-critical model-based analyses, but the widespread adoption of sophisticated UQ methods is limited by technical complexity. In this paper, we introduce UM-Bridge (the UQ and Modeling Bridge), a high-level abstraction and software protocol that facilitates universal interoperability of UQ software with simulation codes. It breaks down the technical complexity of advanced UQ applications and enables separation of concerns between experts. UM-Bridge democratizes UQ by allowing effective interdisciplinary collaboration, accelerating the development of advanced UQ methods, and making it easy to perform UQ analyses from prototype to High Performance Computing (HPC) scale. In addition, we present a library of ready-to-run UQ benchmark problems, all easily accessible through UM-Bridge. These benchmarks support UQ methodology research, enabling reproducible performance comparisons. We demonstrate UM-Bridge with several scientific applications, harnessing HPC resources even using UQ codes not designed with HPC support.

Submitted 9 September, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: Add Benjamin Kent as co-author in accordance with the paper's published version

arXiv:2104.11079 [pdf, other]

doi 10.2172/1807223

Randomized Algorithms for Scientific Computing (RASC)

Authors: Aydin Buluc, Tamara G. Kolda, Stefan M. Wild, Mihai Anitescu, Anthony DeGennaro, John Jakeman, Chandrika Kamath, Ramakrishnan Kannan, Miles E. Lopes, Per-Gunnar Martinsson, Kary Myers, Jelani Nelson, Juan M. Restrepo, C. Seshadhri, Draguna Vrabie, Brendt Wohlberg, Stephen J. Wright, Chao Yang, Peter Zwart

Abstract: Randomized algorithms have propelled advances in artificial intelligence and represent a foundational research area in advancing AI for Science. Future advancements in DOE Office of Science priority areas such as climate science, astrophysics, fusion, advanced materials, combustion, and quantum computing all require randomized algorithms for surmounting challenges of complexity, robustness, and scalability. This report summarizes the outcomes of that workshop, "Randomized Algorithms for Scientific Computing (RASC)," held virtually across four days in December 2020 and January 2021.

Submitted 21 March, 2022; v1 submitted 19 April, 2021; originally announced April 2021.

arXiv:2008.02672 [pdf, other]

MFNets: Data efficient all-at-once learning of multifidelity surrogates as directed networks of information sources

Authors: Alex Gorodetsky, John D. Jakeman, Gianluca Geraci

Abstract: We present an approach for constructing a surrogate from ensembles of information sources of varying cost and accuracy. The multifidelity surrogate encodes connections between information sources as a directed acyclic graph, and is trained via gradient-based minimization of a nonlinear least squares objective. While the vast majority of state-of-the-art assumes hierarchical connections between information sources, our approach works with flexibly structured information sources that may not admit a strict hierarchy. The formulation has two advantages: (1) increased data efficiency due to parsimonious multifidelity networks that can be tailored to the application; and (2) no constraints on the training data -- we can combine noisy, non-nested evaluations of the information sources. Numerical examples ranging from synthetic to physics-based computational mechanics simulations indicate the error in our approach can be orders-of-magnitude smaller, particularly in the low-data regime, than single-fidelity and hierarchical multifidelity approaches.

Submitted 23 August, 2021; v1 submitted 3 August, 2020; originally announced August 2020.

Comments: 24 pages

MSC Class: 62J02; 65D15; 41A10

arXiv:2006.09319 [pdf, other]

doi 10.1615/JMachLearnModelComput.2020035155

A Survey of Constrained Gaussian Process Regression: Approaches and Implementation Challenges

Authors: Laura Swiler, Mamikon Gulian, Ari Frankel, Cosmin Safta, John Jakeman

Abstract: Gaussian process regression is a popular Bayesian framework for surrogate modeling of expensive data sources. As part of a broader effort in scientific machine learning, many recent works have incorporated physical constraints or other a priori information within Gaussian process regression to supplement limited data and regularize the behavior of the model. We provide an overview and survey of several classes of Gaussian process constraints, including positivity or bound constraints, monotonicity and convexity constraints, differential equation constraints provided by linear PDEs, and boundary condition constraints. We compare the strategies behind each approach as well as the differences in implementation, concluding with a discussion of the computational challenges introduced by constraints.

Submitted 6 January, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

Comments: 42 pages, 3 figures. Version 3: DOI & Reference added; appeared in Journal of Machine Learning for Modeling and Computing. Version 2 includes minor additions, clarifications and improvements to notation

Journal ref: Journal of Machine Learning for Modeling and Computing, 1(2):119-156 (2020)

arXiv:2006.02392 [pdf, other]

Data-driven learning of non-autonomous systems

Authors: Tong Qin, Zhen Chen, John Jakeman, Dongbin Xiu

Abstract: We present a numerical framework for recovering unknown non-autonomous dynamical systems with time-dependent inputs. To circumvent the difficulty presented by the non-autonomous nature of the system, our method transforms the solution state into piecewise integration of the system over a discrete set of time instances. The time-dependent inputs are then locally parameterized by using a proper model, for example, polynomial regression, in the pieces determined by the time instances. This transforms the original system into a piecewise parametric system that is locally time invariant. We then design a deep neural network structure to learn the local models. Once the network model is constructed, it can be iteratively used over time to conduct global system prediction. We provide theoretical analysis of our algorithm and present a number of numerical examples to demonstrate the effectiveness of the method.

Submitted 2 June, 2020; originally announced June 2020.

arXiv:1110.0010 [pdf, other]

Local and Dimension Adaptive Sparse Grid Interpolation and Quadrature

Authors: John D. Jakeman, Stephen G. Roberts

Abstract: In this paper we present a locally and dimension-adaptive sparse grid method for interpolation and integration of high-dimensional functions with discontinuities. The proposed algorithm combines the strengths of the generalised sparse grid algorithm and hierarchical surplus-guided local adaptivity. A high-degree basis is used to obtain a high-order method which, given sufficient smoothness, performs significantly better than the piecewise-linear basis. The underlying generalised sparse grid algorithm greedily selects the dimensions and variable interactions that contribute most to the variability of a function. The hierarchical surplus of points within the sparse grid is used as an error criterion for local refinement with the aim of concentrating computational effort within rapidly varying or discontinuous regions. This approach limits the number of points that are invested in 'unimportant' dimensions and regions within the high-dimensional domain. We show the utility of the proposed method for non-smooth functions with hundreds of variables.

Submitted 30 September, 2011; originally announced October 2011.

Showing 1–10 of 10 results for author: Jakeman, J