-
Statistical accuracy of the ensemble Kalman filter in the near-linear setting
Authors:
E. Calvello,
J. A. Carrillo,
F. Hoffmann,
P. Monmarché,
A. M. Stuart,
U. Vaes
Abstract:
Estimating the state of a dynamical system from partial and noisy observations is a ubiquitous problem in a large number of applications, such as probabilistic weather forecasting and prediction of epidemics. Particle filters are a widely adopted approach to the problem and provide provably accurate approximations of the statistics of the state, but they perform poorly in high dimensions because of weight collapse. The ensemble Kalman filter does not suffer from this issue, as it relies on an interacting particle system with equal weights. Despite its wide adoption in the geophysical sciences, mathematical analysis of the accuracy of this filter is predominantly confined to the setting of linear dynamical models and linear observation operators, and analysis beyond the linear Gaussian setting is still in its infancy. In this short note, we provide an accessible overview of recent work in which the authors take first steps to analyze the accuracy of the filter beyond the linear Gaussian setting.
Submitted 20 March, 2025;
originally announced March 2025.
-
Solving Roughly Forced Nonlinear PDEs via Misspecified Kernel Methods and Neural Networks
Authors:
Ricardo Baptista,
Edoardo Calvello,
Matthieu Darcy,
Houman Owhadi,
Andrew M. Stuart,
Xianjin Yang
Abstract:
We consider the use of Gaussian Processes (GPs) or Neural Networks (NNs) to numerically approximate the solutions to nonlinear partial differential equations (PDEs) with rough forcing or source terms, which commonly arise as pathwise solutions to stochastic PDEs. Kernel methods have recently been generalized to solve nonlinear PDEs by approximating their solutions as the maximum a posteriori estimator of GPs that are conditioned to satisfy the PDE at a finite set of collocation points. The convergence and error guarantees of these methods, however, rely on the PDE being defined in a classical sense and its solution possessing sufficient regularity to belong to the associated reproducing kernel Hilbert space. We propose a generalization of these methods to handle roughly forced nonlinear PDEs while preserving convergence guarantees with an oversmoothing GP kernel that is misspecified relative to the true solution's regularity. This is achieved by conditioning a regular GP to satisfy the PDE with a modified source term in a weak sense (when integrated against a finite number of test functions). This is equivalent to replacing the empirical $L^2$-loss on the PDE constraint by an empirical negative-Sobolev norm. We further show that this loss function can be used to extend physics-informed neural networks (PINNs) to stochastic equations, thereby resulting in a new NN-based variant termed Negative Sobolev Norm-PINN (NeS-PINN).
Submitted 29 January, 2025; v1 submitted 28 January, 2025;
originally announced January 2025.
-
Accuracy of the Ensemble Kalman Filter in the Near-Linear Setting
Authors:
Edoardo Calvello,
Pierre Monmarché,
Andrew M. Stuart,
Urbain Vaes
Abstract:
The filtering distribution captures the statistics of the state of a dynamical system from partial and noisy observations. Classical particle filters provably approximate this distribution in quite general settings; however they behave poorly for high dimensional problems, suffering weight collapse. This issue is circumvented by the ensemble Kalman filter which is an equal-weight interacting particle system. However, this finite particle system is only proven to approximate the true filter in the linear Gaussian case. In practice, however, it is applied in much broader settings; as a result, establishing its approximation properties more generally is important. There has been recent progress in the theoretical analysis of the algorithm, establishing stability and error estimates in non-Gaussian settings, but the assumptions on the dynamics and observation models rule out the unbounded vector fields that arise in practice and the analysis applies only to the mean field limit of the ensemble Kalman filter. The present work establishes error bounds between the filtering distribution and the finite particle ensemble Kalman filter when the dynamics and observation vector fields may be unbounded, allowing linear growth.
Submitted 6 February, 2025; v1 submitted 15 September, 2024;
originally announced September 2024.
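For readers unfamiliar with the algorithm named in the abstract above, the following is a minimal sketch of one step of an equal-weight ensemble Kalman filter in its common perturbed-observation form. It is an illustration under stated assumptions, not the precise variant analyzed in the paper: the function name `enkf_step`, the argument names, and the choice of Gaussian model and observation noise are all hypothetical, and `psi` and `h` stand for dynamics and observation maps supplied by the user.

```python
import numpy as np

def enkf_step(ensemble, y, psi, h, sigma, gamma, rng):
    """One forecast/analysis step of a perturbed-observation EnKF (sketch).

    ensemble : (J, d) array of equally weighted particles
    y        : (k,) observation
    psi, h   : dynamics and observation maps, applied row-wise to (J, d) arrays
    sigma    : (d, d) model noise covariance (assumed Gaussian)
    gamma    : (k, k) observation noise covariance (assumed Gaussian)
    """
    J, d = ensemble.shape
    k = y.shape[0]

    # Forecast: push each particle through the noisy dynamics.
    v_hat = psi(ensemble) + rng.multivariate_normal(np.zeros(d), sigma, size=J)
    h_hat = h(v_hat)

    # Empirical covariances computed from the forecast ensemble.
    dv = v_hat - v_hat.mean(axis=0)
    dh = h_hat - h_hat.mean(axis=0)
    c_vh = dv.T @ dh / (J - 1)            # (d, k) state/observation covariance
    c_hh = dh.T @ dh / (J - 1) + gamma    # (k, k) innovation covariance

    # Kalman gain c_vh @ inv(c_hh), using the symmetry of c_hh.
    gain = np.linalg.solve(c_hh, c_vh.T).T

    # Analysis: every particle is shifted using its own perturbed observation,
    # so all particles keep equal weight (no reweighting, no resampling).
    eta = rng.multivariate_normal(np.zeros(k), gamma, size=J)
    return v_hat + (y + eta - h_hat) @ gain.T
```

In a scalar linear Gaussian test problem, the mean and variance of the returned ensemble should approach those of the exact Kalman posterior as the ensemble size grows, which is the setting where the abstract says accuracy is already established.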
-
Continuum Attention for Neural Operators
Authors:
Edoardo Calvello,
Nikola B. Kovachki,
Matthew E. Levine,
Andrew M. Stuart
Abstract:
Transformers, and the attention mechanism in particular, have become ubiquitous in machine learning. Their success in modeling nonlocal, long-range correlations has led to their widespread adoption in natural language processing, computer vision, and time-series problems. Neural operators, which map spaces of functions into spaces of functions, are necessarily both nonlinear and nonlocal if they are universal; it is thus natural to ask whether the attention mechanism can be used in the design of neural operators. Motivated by this, we study transformers in the function space setting. We formulate attention as a map between infinite dimensional function spaces and prove that the attention mechanism as implemented in practice is a Monte Carlo or finite difference approximation of this operator. The function space formulation allows for the design of transformer neural operators, a class of architectures designed to learn mappings between function spaces, for which we prove a universal approximation result. The prohibitive cost of applying the attention operator to functions defined on multi-dimensional domains leads to the need for more efficient attention-based architectures. For this reason we also introduce a function space generalization of the patching strategy from computer vision, and introduce a class of associated neural operators. Numerical results, on an array of operator learning problems, demonstrate the promise of our approaches to function space formulations of attention and their use in neural operators.
Submitted 10 June, 2024;
originally announced June 2024.
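The statement in the abstract above that attention as implemented in practice is a Monte Carlo approximation of an operator on function space can be illustrated with a small numerical sketch. It assumes a softmax-normalized integral form of attention on the domain [0, 1]; the input function `u`, the matrices `Q`, `K`, `V`, and the function name `discrete_attention` are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np

def u(y):
    # Hypothetical two-channel input function on [0, 1].
    return np.stack([np.sin(2 * np.pi * y), np.cos(2 * np.pi * y)], axis=-1)

# Hypothetical query, key and value matrices.
Q = np.array([[1.0, 0.5], [-0.5, 1.0]])
K = np.array([[0.8, -0.2], [0.3, 1.1]])
V = np.array([[1.0, 0.0], [0.0, 2.0]])

def discrete_attention(x, ys):
    """Softmax attention of the query point x against the points ys.

    The normalized sum over ys is a ratio of two sample averages, so with
    ys drawn uniformly from [0, 1] it is a Monte Carlo estimate of
        int exp(<Q u(x), K u(y)>) V u(y) dy / int exp(<Q u(x), K u(y)>) dy.
    """
    q = u(np.asarray(x)) @ Q.T           # query vector, shape (2,)
    k = u(ys) @ K.T                      # keys, shape (N, 2)
    v = u(ys) @ V.T                      # values, shape (N, 2)
    scores = k @ q                       # shape (N,)
    w = np.exp(scores - scores.max())    # numerically stabilized softmax
    w /= w.sum()
    return w @ v

rng = np.random.default_rng(0)
grid = (np.arange(20000) + 0.5) / 20000  # midpoint-rule quadrature nodes
reference = discrete_attention(0.3, grid)
for n in (100, 10000):
    estimate = discrete_attention(0.3, rng.uniform(size=n))
    print(n, np.linalg.norm(estimate - reference))
```

The same function evaluated on a uniform grid acts as a quadrature (finite difference style) approximation of the same ratio of integrals, which is why the grid evaluation is used here as the reference value.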
-
Ensemble Kalman Methods: A Mean Field Perspective
Authors:
Edoardo Calvello,
Sebastian Reich,
Andrew M. Stuart
Abstract:
Ensemble Kalman methods are widely used for state estimation in the geophysical sciences. Their success stems from the fact that they take an underlying (possibly noisy) dynamical system as a black box to provide a systematic, derivative-free methodology for incorporating noisy, partial and possibly indirect observations to update estimates of the state; furthermore the ensemble approach allows for sensitivities and uncertainties to be calculated. The methodology was introduced in 1994 in the context of ocean state estimation. Soon thereafter it was adopted by the numerical weather prediction community and is now a key component of the best weather prediction systems worldwide. Furthermore the methodology is starting to be widely adopted for numerous problems in the geophysical sciences and is being developed as the basis for general purpose derivative-free inversion methods that show great promise. Despite this empirical success, analysis of the accuracy of ensemble Kalman methods, in terms of their capabilities as both state estimators and quantifiers of uncertainty, is lagging. The purpose of this paper is to provide a unifying mean field based framework for the derivation and analysis of ensemble Kalman methods. Both state estimation and parameter estimation problems (inverse problems) are considered, and formulations in both discrete and continuous time are employed. For state estimation problems, both the control and filtering approaches are considered; analogously for parameter estimation problems, the optimization and Bayesian perspectives are both studied. The mean field perspective provides an elegant framework, suitable for analysis; furthermore, a variety of methods used in practice can be derived from mean field systems by using interacting particle system approximations. The approach taken also unifies a wide-ranging literature in the field and suggests open problems.
Submitted 7 October, 2024; v1 submitted 22 September, 2022;
originally announced September 2022.