Kernel Methods are Competitive for Operator Learning
Authors:
Pau Batlle,
Matthieu Darcy,
Bamdad Hosseini,
Houman Owhadi
Abstract:
We present a general kernel-based framework for learning operators between Banach spaces along with a priori error analysis and comprehensive numerical comparisons with popular neural net (NN) approaches such as Deep Operator Net (DeepONet) [Lu et al.] and Fourier Neural Operator (FNO) [Li et al.]. We consider the setting where the input/output spaces of the target operator $\mathcal{G}^\dagger\,:\, \mathcal{U}\to \mathcal{V}$ are reproducing kernel Hilbert spaces (RKHS), the data comes in the form of partial observations $\phi(u_i), \varphi(v_i)$ of input/output functions $v_i=\mathcal{G}^\dagger(u_i)$ ($i=1,\ldots,N$), and the measurement operators $\phi\,:\, \mathcal{U}\to \mathbb{R}^n$ and $\varphi\,:\, \mathcal{V} \to \mathbb{R}^m$ are linear. Writing $\psi\,:\, \mathbb{R}^n \to \mathcal{U}$ and $\chi\,:\, \mathbb{R}^m \to \mathcal{V}$ for the optimal recovery maps associated with $\phi$ and $\varphi$, we approximate $\mathcal{G}^\dagger$ with $\bar{\mathcal{G}}=\chi\circ \bar{f} \circ \phi$ where $\bar{f}$ is an optimal recovery approximation of $f^\dagger:=\varphi \circ \mathcal{G}^\dagger \circ \psi\,:\,\mathbb{R}^n \to \mathbb{R}^m$. We show that, even when using vanilla kernels (e.g., linear or Matérn), our approach is competitive in terms of cost-accuracy trade-off and either matches or beats the performance of NN methods on a majority of benchmarks. Additionally, our framework offers several advantages inherited from kernel methods: simplicity, interpretability, convergence guarantees, a priori error estimates, and Bayesian uncertainty quantification. As such, it can serve as a natural benchmark for operator learning.
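The finite-dimensional step of this construction, learning $\bar{f}\,:\,\mathbb{R}^n \to \mathbb{R}^m$ from measurement pairs, can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration and is not taken from the paper: the measurements are pointwise values on a uniform grid, the target operator is a Riemann-sum antiderivative, the kernel is linear, a small nugget stands in for regularization, and the recovery maps $\psi$ and $\chi$ are omitted. The function names (`sample_inputs`, `target_operator`, `fit`, `predict`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16                                   # number of point measurements per function
grid = np.linspace(0.0, 1.0, n)
# Five sine modes sampled on the grid (assumed input family), shape (5, n).
modes = np.sin(np.pi * np.outer(np.arange(1, 6), grid))

def sample_inputs(num):
    """Random combinations of the sine modes, observed on the grid."""
    return rng.standard_normal((num, 5)) @ modes

def target_operator(U):
    """Illustrative operator: Riemann-sum antiderivative on the grid."""
    return np.cumsum(U, axis=1) / n

def linear_kernel(X, Y):
    """Vanilla linear kernel k(x, y) = <x, y>."""
    return X @ Y.T

def fit(X, V, kernel, nugget=1e-8):
    """Solve (K + nugget * I) C = V for the kernel coefficients C."""
    K = kernel(X, X)
    return np.linalg.solve(K + nugget * np.eye(len(X)), V)

def predict(X_new, X, coeffs, kernel):
    """Evaluate the learned map at new measurement vectors."""
    return kernel(X_new, X) @ coeffs

# Training pairs (phi(u_i), varphi(v_i)) with v_i = G(u_i).
U_train = sample_inputs(50)
V_train = target_operator(U_train)
coeffs = fit(U_train, V_train, linear_kernel)

# Held-out inputs drawn from the same family.
U_test = sample_inputs(20)
V_true = target_operator(U_test)
V_pred = predict(U_test, U_train, coeffs, linear_kernel)
rel_err = np.linalg.norm(V_pred - V_true) / np.linalg.norm(V_true)
print(f"relative test error: {rel_err:.2e}")
```

Because the illustrative operator is linear and the test inputs lie in the span of the training inputs, the linear kernel recovers it almost exactly here; a nonlinear operator would call for a nonlinear kernel such as Matérn.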
Submitted 8 October, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
One-Shot Learning of Stochastic Differential Equations with Data Adapted Kernels
Authors:
Matthieu Darcy,
Boumediene Hamzi,
Giulia Livieri,
Houman Owhadi,
Peyman Tavallali
Abstract:
We consider the problem of learning Stochastic Differential Equations of the form $dX_t = f(X_t)dt+\sigma(X_t)dW_t$ from one sample trajectory. This problem is more challenging than learning deterministic dynamical systems because one sample trajectory only provides indirect information on the unknown functions $f$, $\sigma$, and stochastic process $dW_t$ representing the drift, the diffusion, and the stochastic forcing terms, respectively. We propose a method that combines Computational Graph Completion and data adapted kernels learned via a new variant of cross validation. Our approach can be decomposed as follows: (1) Represent the time-increment map $X_t \rightarrow X_{t+dt}$ as a Computational Graph in which $f$, $\sigma$ and $dW_t$ appear as unknown functions and random variables. (2) Complete the graph (approximate unknown functions and random variables) via Maximum a Posteriori Estimation (given the data) with Gaussian Process (GP) priors on the unknown functions. (3) Learn the covariance functions (kernels) of the GP priors from data with randomized cross-validation. Numerical experiments illustrate the efficacy, robustness, and scope of our method.
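To make the data setting concrete, the time-increment map $X_t \rightarrow X_{t+dt}$ can be sketched with an Euler-Maruyama step, which is an assumed discretization chosen here for illustration and not necessarily the one used by the authors. The drift and diffusion below are hypothetical examples, and the function name `euler_maruyama` is likewise illustrative. The sketch only generates one sample trajectory of the kind the method would take as input; it does not implement the graph completion or kernel learning steps.

```python
import numpy as np

def euler_maruyama(f, sigma, x0, dt, num_steps, rng):
    """Simulate one trajectory of dX = f(X) dt + sigma(X) dW.

    Each step applies the assumed time-increment map
    X_{t+dt} = X_t + f(X_t) dt + sigma(X_t) dW, with dW ~ N(0, dt).
    """
    x = np.empty(num_steps + 1)
    x[0] = x0
    for k in range(num_steps):
        dW = np.sqrt(dt) * rng.standard_normal()   # Brownian increment
        x[k + 1] = x[k] + f(x[k]) * dt + sigma(x[k]) * dW
    return x

# Hypothetical example: mean-reverting drift with constant diffusion.
drift = lambda x: -x
diffusion = lambda x: 0.5

path = euler_maruyama(drift, diffusion, 1.0, 0.01, 1000, np.random.default_rng(0))

# With zero diffusion the map reduces to explicit Euler for dx/dt = -x.
det = euler_maruyama(drift, lambda x: 0.0, 1.0, 0.01, 100, np.random.default_rng(1))
```

Only `path` would be observed in practice: the learner sees the states $X_t$ but neither $f$, $\sigma$, nor the increments $dW_t$, which is why the abstract describes the information as indirect.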
Submitted 1 December, 2022; v1 submitted 24 September, 2022;
originally announced September 2022.