-
Fast filtering of non-Gaussian models using Amortized Optimal Transport Maps
Authors:
Mohammad Al-Jarrah,
Bamdad Hosseini,
Amirhossein Taghvaei
Abstract:
In this paper, we present the amortized optimal transport filter (A-OTF) designed to mitigate the computational burden associated with the real-time training of optimal transport filters (OTFs). OTFs can perform accurate non-Gaussian Bayesian updates in the filtering procedure, but they require training at every time step, which makes them expensive. The proposed A-OTF framework exploits the similarity between OTF maps during an initial/offline training stage in order to reduce the cost of inference during online calculations. More precisely, we use clustering algorithms to select relevant subsets of pre-trained maps whose weighted average is used to compute the A-OTF model akin to a mixture of experts. A series of numerical experiments validate that A-OTF achieves substantial computational savings during online inference while preserving the inherent flexibility and accuracy of OTF.
Submitted 20 May, 2025; v1 submitted 16 March, 2025;
originally announced March 2025.
-
Data-Efficient Kernel Methods for Learning Differential Equations and Their Solution Operators: Algorithms and Error Analysis
Authors:
Yasamin Jalalian,
Juan Felipe Osorio Ramirez,
Alexander Hsu,
Bamdad Hosseini,
Houman Owhadi
Abstract:
We introduce a novel kernel-based framework for learning differential equations and their solution maps that is efficient in data requirements, in terms of solution examples and amount of measurements from each example, and computational cost, in terms of training procedures. Our approach is mathematically interpretable and backed by rigorous theoretical guarantees in the form of quantitative worst-case error bounds for the learned equation. Numerical benchmarks demonstrate significant improvements in computational complexity and robustness while achieving one to two orders of magnitude improvements in terms of accuracy compared to state-of-the-art algorithms.
Submitted 4 April, 2025; v1 submitted 2 March, 2025;
originally announced March 2025.
-
Gaussian Measures Conditioned on Nonlinear Observations: Consistency, MAP Estimators, and Simulation
Authors:
Yifan Chen,
Bamdad Hosseini,
Houman Owhadi,
Andrew M Stuart
Abstract:
The article presents a systematic study of the problem of conditioning a Gaussian random variable $ξ$ on nonlinear observations of the form $F \circ φ(ξ)$ where $φ: \mathcal{X} \to \mathbb{R}^N$ is a bounded linear operator and $F$ is nonlinear. Such problems arise in the context of Bayesian inference and recent machine learning-inspired PDE solvers. We give a representer theorem for the conditioned random variable $ξ\mid F\circ φ(ξ)$, stating that it decomposes as the sum of an infinite-dimensional Gaussian (which is identified analytically) as well as a finite-dimensional non-Gaussian measure. We also introduce a novel notion of the mode of a conditional measure by taking the limit of the natural relaxation of the problem, to which we can apply the existing notion of maximum a posteriori estimators of posterior measures. Finally, we introduce a variant of the Laplace approximation for the efficient simulation of the aforementioned conditioned Gaussian random variables towards uncertainty quantification.
Submitted 21 May, 2024;
originally announced May 2024.
-
Data-Driven Approximation of Stationary Nonlinear Filters with Optimal Transport Maps
Authors:
Mohammad Al-Jarrah,
Bamdad Hosseini,
Amirhossein Taghvaei
Abstract:
The nonlinear filtering problem is concerned with finding the conditional probability distribution (posterior) of the state of a stochastic dynamical system, given a history of partial and noisy observations. This paper presents a data-driven nonlinear filtering algorithm for the case when the state and observation processes are stationary. The posterior is approximated as the push-forward of an optimal transport (OT) map from a given distribution, that is easy to sample from, to the posterior conditioned on a truncated observation window. The OT map is obtained as the solution to a stochastic optimization problem that is solved offline using recorded trajectory data from the state and observations. An error analysis of the algorithm is presented under the stationarity and filter stability assumptions, which decomposes the error into two parts related to the truncation window during training and the error due to the optimization procedure. The performance of the proposed method, referred to as optimal transport data-driven filter (OT-DDF), is evaluated for several numerical examples, highlighting its significant computational efficiency during the online stage while maintaining the flexibility and accuracy of OT methods in nonlinear filtering.
Submitted 22 March, 2024;
originally announced March 2024.
-
Diffeomorphic Measure Matching with Kernels for Generative Modeling
Authors:
Biraj Pandey,
Bamdad Hosseini,
Pau Batlle,
Houman Owhadi
Abstract:
This article presents a general framework for the transport of probability measures towards minimum divergence generative modeling and sampling using ordinary differential equations (ODEs) and Reproducing Kernel Hilbert Spaces (RKHSs), inspired by ideas from diffeomorphic matching and image registration. A theoretical analysis of the proposed method is presented, giving a priori error bounds in terms of the complexity of the model, the number of samples in the training set, and model misspecification. An extensive suite of numerical experiments further highlights the properties, strengths, and weaknesses of the method and extends its applicability to other tasks, such as conditional simulation and inference.
Submitted 12 February, 2024;
originally announced February 2024.
-
Conditional Optimal Transport on Function Spaces
Authors:
Bamdad Hosseini,
Alexander W. Hsu,
Amirhossein Taghvaei
Abstract:
We present a systematic study of conditional triangular transport maps in function spaces from the perspective of optimal transportation and with a view towards amortized Bayesian inference. More specifically, we develop a theory of constrained optimal transport problems that describe block-triangular Monge maps that characterize conditional measures along with their Kantorovich relaxations. This generalizes the theory of optimal triangular transport to separable infinite-dimensional function spaces with general cost functions. We further tailor our results to the case of Bayesian inference problems and obtain regularity estimates on the conditioning maps from the prior to the posterior. Finally, we present numerical experiments that demonstrate the computational applicability of our theoretical results for amortized and likelihood-free inference of functional parameters.
Submitted 6 February, 2024; v1 submitted 9 November, 2023;
originally announced November 2023.
-
Nonlinear Filtering with Brenier Optimal Transport Maps
Authors:
Mohammad Al-Jarrah,
Niyizhen Jin,
Bamdad Hosseini,
Amirhossein Taghvaei
Abstract:
This paper is concerned with the problem of nonlinear filtering, i.e., computing the conditional distribution of the state of a stochastic dynamical system given a history of noisy partial observations. Conventional sequential importance resampling (SIR) particle filters suffer from fundamental limitations, in scenarios involving degenerate likelihoods or high-dimensional states, due to the weight degeneracy issue. In this paper, we explore an alternative method, which is based on estimating the Brenier optimal transport (OT) map from the current prior distribution of the state to the posterior distribution at the next time step. Unlike SIR particle filters, the OT formulation does not require the analytical form of the likelihood. Moreover, it allows us to harness the approximation power of neural networks to model complex and multi-modal distributions and employ stochastic optimization algorithms to enhance scalability. Extensive numerical experiments are presented that compare the OT method to the SIR particle filter and the ensemble Kalman filter, evaluating the performance in terms of sample efficiency, high-dimensional scalability, and the ability to capture complex and multi-modal distributions.
Submitted 2 February, 2024; v1 submitted 20 October, 2023;
originally announced October 2023.
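The weight degeneracy mentioned in the abstract above can be illustrated with a small, self-contained sketch. This is not the authors' code: the particle grid, the scalar Gaussian likelihood, and the function names `normalized_weights` and `effective_sample_size` are assumptions chosen for illustration. It computes importance weights as an SIR filter would and reports the effective sample size (ESS), which drops toward 1 as the likelihood becomes nearly degenerate.

```python
import math

def normalized_weights(particles, y, sigma):
    """Importance weights for an assumed Gaussian likelihood N(y; x, sigma^2).

    Weights are computed in log space and shifted by the maximum
    so that very sharp likelihoods do not underflow to all zeros.
    """
    log_w = [-0.5 * ((y - x) / sigma) ** 2 for x in particles]
    shift = max(log_w)
    w = [math.exp(lw - shift) for lw in log_w]
    total = sum(w)
    return [wi / total for wi in w]

def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2) for normalized weights; ranges from 1 to len(weights)."""
    return 1.0 / sum(w * w for w in weights)

# Hypothetical setup: 101 equally spaced particles on [-3, 3], observation y = 0.
particles = [-3.0 + 0.06 * k for k in range(101)]

# Broad likelihood: many particles carry non-negligible weight.
ess_broad = effective_sample_size(normalized_weights(particles, 0.0, 1.0))

# Nearly degenerate likelihood: essentially one particle carries all the weight.
ess_sharp = effective_sample_size(normalized_weights(particles, 0.0, 0.01))

print(f"ESS with sigma=1.0:  {ess_broad:.2f}")
print(f"ESS with sigma=0.01: {ess_sharp:.2f}")
```

A transport-based update moves particles instead of reweighting them, which is one way to read the abstract's motivation for avoiding this collapse.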
-
Error Analysis of Kernel/GP Methods for Nonlinear and Parametric PDEs
Authors:
Pau Batlle,
Yifan Chen,
Bamdad Hosseini,
Houman Owhadi,
Andrew M Stuart
Abstract:
We introduce a priori Sobolev-space error estimates for the solution of nonlinear, and possibly parametric, PDEs using Gaussian process and kernel based methods. The primary assumptions are: (1) a continuous embedding of the reproducing kernel Hilbert space of the kernel into a Sobolev space of sufficient regularity; and (2) the stability of the differential operator and the solution map of the PDE between corresponding Sobolev spaces. The proof is articulated around Sobolev norm error estimates for kernel interpolants and relies on the minimizing norm property of the solution. The error estimates demonstrate dimension-benign convergence rates if the solution space of the PDE is smooth enough. We illustrate these points with applications to high-dimensional nonlinear elliptic PDEs and parametric PDEs. Although some recent machine learning methods have been presented as breaking the curse of dimensionality in solving high-dimensional PDEs, our analysis suggests a more nuanced picture: there is a trade-off between the regularity of the solution and the presence of the curse of dimensionality. Therefore, our results are in line with the understanding that the curse is absent when the solution is regular enough.
Submitted 8 May, 2023;
originally announced May 2023.
-
Kernel Methods are Competitive for Operator Learning
Authors:
Pau Batlle,
Matthieu Darcy,
Bamdad Hosseini,
Houman Owhadi
Abstract:
We present a general kernel-based framework for learning operators between Banach spaces along with a priori error analysis and comprehensive numerical comparisons with popular neural net (NN) approaches such as Deep Operator Net (DeepONet) [Lu et al.] and Fourier Neural Operator (FNO) [Li et al.]. We consider the setting where the input/output spaces of target operator $\mathcal{G}^\dagger\,:\, \mathcal{U}\to \mathcal{V}$ are reproducing kernel Hilbert spaces (RKHS), the data comes in the form of partial observations $φ(u_i), \varphi(v_i)$ of input/output functions $v_i=\mathcal{G}^\dagger(u_i)$ ($i=1,\ldots,N$), and the measurement operators $φ\,:\, \mathcal{U}\to \mathbb{R}^n$ and $\varphi\,:\, \mathcal{V} \to \mathbb{R}^m$ are linear. Writing $ψ\,:\, \mathbb{R}^n \to \mathcal{U}$ and $χ\,:\, \mathbb{R}^m \to \mathcal{V}$ for the optimal recovery maps associated with $φ$ and $\varphi$, we approximate $\mathcal{G}^\dagger$ with $\bar{\mathcal{G}}=χ\circ \bar{f} \circ φ$ where $\bar{f}$ is an optimal recovery approximation of $f^\dagger:=\varphi \circ \mathcal{G}^\dagger \circ ψ\,:\,\mathbb{R}^n \to \mathbb{R}^m$. We show that, even when using vanilla kernels (e.g., linear or Matérn), our approach is competitive in terms of cost-accuracy trade-off and either matches or beats the performance of NN methods on a majority of benchmarks. Additionally, our framework offers several advantages inherited from kernel methods: simplicity, interpretability, convergence guarantees, a priori error estimates, and Bayesian uncertainty quantification. As such, it can serve as a natural benchmark for operator learning.
Submitted 8 October, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
Optimal Transport Particle Filters
Authors:
Mohammad Al-Jarrah,
Bamdad Hosseini,
Amirhossein Taghvaei
Abstract:
This paper is concerned with the theoretical and computational development of a new class of nonlinear filtering algorithms called the optimal transport particle filters (OTPF). The algorithm is based on a recently introduced variational formulation of the Bayes' rule, which aims to find the Brenier optimal transport map between the prior and the posterior distributions as the solution to a stochastic optimization problem. On the theoretical side, the existing methods for the error analysis of particle filters and stability results for optimal transport map estimation are combined to obtain uniform error bounds for the filter's performance in terms of the optimization gap in solving the variational problem. The error analysis reveals a bias-variance trade-off that can ultimately be used to understand if/when the curse of dimensionality can be avoided in these filters. On the computational side, the proposed algorithm is evaluated on a nonlinear filtering example in comparison with the ensemble Kalman filter (EnKF) and the sequential importance resampling (SIR) particle filter.
Submitted 1 April, 2023;
originally announced April 2023.
-
Bayesian Posterior Perturbation Analysis with Integral Probability Metrics
Authors:
Alfredo Garbuno-Inigo,
Tapio Helin,
Franca Hoffmann,
Bamdad Hosseini
Abstract:
In recent years, Bayesian inference in large-scale inverse problems found in science, engineering and machine learning has gained significant attention. This paper examines the robustness of the Bayesian approach by analyzing the stability of posterior measures in relation to perturbations in the likelihood potential and the prior measure. We present new stability results using a family of integral probability metrics (divergences) akin to dual problems that arise in optimal transport. Our results stand out from previous works in three directions: (1) We construct new families of integral probability metrics that are adapted to the problem at hand; (2) These new metrics allow us to study both likelihood and prior perturbations in a convenient way; and (3) our analysis accommodates likelihood potentials that are only locally Lipschitz, making them applicable to a wide range of nonlinear inverse problems. Our theoretical findings are further reinforced through specific and novel examples where the approximation rates of posterior measures are obtained for different types of perturbations and provide a path towards the convergence analysis of recently adapted machine learning techniques for Bayesian inverse problems such as data-driven priors and neural network surrogates.
Submitted 2 March, 2023;
originally announced March 2023.
-
An Approximation Theory Framework for Measure-Transport Sampling Algorithms
Authors:
Ricardo Baptista,
Bamdad Hosseini,
Nikola B. Kovachki,
Youssef M. Marzouk,
Amir Sagiv
Abstract:
This article presents a general approximation-theoretic framework to analyze measure transport algorithms for probabilistic modeling. A primary motivating application for such algorithms is sampling -- a central task in statistical inference and generative modeling. We provide a priori error estimates in the continuum limit, i.e., when the measures (or their densities) are given, but when the transport map is discretized or approximated using a finite-dimensional function space. Our analysis relies on the regularity theory of transport maps and on classical approximation theory for high-dimensional functions. A third element of our analysis, which is of independent interest, is the development of new stability estimates that relate the distance between two maps to the distance (or divergence) between the pushforward measures they define. We present a series of applications of our framework, where quantitative convergence rates are obtained for practical problems using Wasserstein metrics, maximum mean discrepancy, and Kullback--Leibler divergence. Specialized rates for approximations of the popular triangular Knothe-Rosenblatt maps are obtained, followed by numerical experiments that demonstrate and extend our theory.
Submitted 18 September, 2024; v1 submitted 27 February, 2023;
originally announced February 2023.
-
Intrinsic Sparsity of Kantorovich Solutions
Authors:
Bamdad Hosseini,
Stefan Steinerberger
Abstract:
Let $X,Y$ be two finite sets of points having $\#X = m$ and $\#Y = n$ points with $μ= (1/m) \sum_{i=1}^{m} δ_{x_i}$ and $ν= (1/n) \sum_{j=1}^{n} δ_{y_j}$ being the associated uniform probability measures. A result of Birkhoff implies that if $m = n$, then the Kantorovich problem has a solution which also solves the Monge problem: optimal transport can be realized with a bijection $π: X \rightarrow Y$. This is impossible when $m \neq n$. We observe that when $m \neq n$, there exists a solution of the Kantorovich problem such that the mass of each point in $X$ is moved to at most $n/\gcd(m,n)$ different points in $Y$ and that, conversely, each point in $Y$ receives mass from at most $m/\gcd(m,n)$ points in $X$.
Submitted 1 June, 2022; v1 submitted 5 May, 2022;
originally announced May 2022.
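The sparsity bounds in the abstract above can be checked on a simple special case. The sketch below is an illustration only, not the paper's construction: it assumes sorted points on the real line, where the monotone (northwest-corner) coupling is known to be optimal for convex costs, and the helper names `monotone_coupling` and `support_sizes` are hypothetical. It builds the coupling with exact rational arithmetic and compares the number of targets per source point with $n/\gcd(m,n)$ and the number of sources per target point with $m/\gcd(m,n)$.

```python
from fractions import Fraction
from math import gcd

def monotone_coupling(m, n):
    """Monotone coupling between uniform measures on m and n sorted points.

    Returns an m-by-n matrix of exact Fractions whose rows sum to 1/m
    and whose columns sum to 1/n.
    """
    plan = [[Fraction(0)] * n for _ in range(m)]
    i = j = 0
    supply = Fraction(1, m)   # mass still to be shipped from x_i
    demand = Fraction(1, n)   # mass still to be received by y_j
    while i < m and j < n:
        moved = min(supply, demand)
        plan[i][j] = moved
        supply -= moved
        demand -= moved
        if supply == 0:       # x_i is exhausted, move to the next source
            i += 1
            supply = Fraction(1, m)
        if demand == 0:       # y_j is full, move to the next target
            j += 1
            demand = Fraction(1, n)
    return plan

def support_sizes(plan):
    """Largest number of nonzero entries in any row and in any column."""
    max_row = max(sum(1 for v in row if v > 0) for row in plan)
    max_col = max(sum(1 for v in col if v > 0) for col in zip(*plan))
    return max_row, max_col

for m, n in [(2, 3), (4, 6), (3, 3), (5, 10)]:
    g = gcd(m, n)
    max_row, max_col = support_sizes(monotone_coupling(m, n))
    print(f"m={m}, n={n}: targets per source {max_row} (bound {n // g}), "
          f"sources per target {max_col} (bound {m // g})")
```

When $m = n$ the coupling is a permutation matrix scaled by $1/m$, which matches the bijection case described in the abstract.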
-
An Optimal Transport Formulation of Bayes' Law for Nonlinear Filtering Algorithms
Authors:
Amirhossein Taghvaei,
Bamdad Hosseini
Abstract:
This paper presents a variational representation of the Bayes' law using optimal transportation theory. The variational representation is in terms of the optimal transportation between the joint distribution of the (state, observation) and their independent coupling. By imposing certain structure on the transport map, the solution to the variational problem is used to construct a Brenier-type map that transports the prior distribution to the posterior distribution for any value of the observation signal. The new formulation is used to derive the optimal transport form of the Ensemble Kalman filter (EnKF) for the discrete-time filtering problem and propose a novel extension of EnKF to the non-Gaussian setting utilizing input convex neural networks. Finally, the proposed methodology is used to derive the optimal transport form of the feedback particle filter (FPF) in the continuous-time limit, which constitutes its first variational construction without explicitly using the nonlinear filtering equation or Bayes' law.
Submitted 13 September, 2022; v1 submitted 22 March, 2022;
originally announced March 2022.
-
Solving and Learning Nonlinear PDEs with Gaussian Processes
Authors:
Yifan Chen,
Bamdad Hosseini,
Houman Owhadi,
Andrew M Stuart
Abstract:
We introduce a simple, rigorous, and unified framework for solving nonlinear partial differential equations (PDEs), and for solving inverse problems (IPs) involving the identification of parameters in PDEs, using the framework of Gaussian processes. The proposed approach: (1) provides a natural generalization of collocation kernel methods to nonlinear PDEs and IPs; (2) has guaranteed convergence for a very general class of PDEs, and comes equipped with a path to compute error bounds for specific PDE approximations; (3) inherits the state-of-the-art computational complexity of linear solvers for dense kernel matrices. The main idea of our method is to approximate the solution of a given PDE as the maximum a posteriori (MAP) estimator of a Gaussian process conditioned on solving the PDE at a finite number of collocation points. Although this optimization problem is infinite-dimensional, it can be reduced to a finite-dimensional one by introducing additional variables corresponding to the values of the derivatives of the solution at collocation points; this generalizes the representer theorem arising in Gaussian process regression. The reduced optimization problem has the form of a quadratic objective function subject to nonlinear constraints; it is solved with a variant of the Gauss--Newton method. The resulting algorithm (a) can be interpreted as solving successive linearizations of the nonlinear PDE, and (b) in practice is found to converge in a small number of iterations (2 to 10), for a wide range of PDEs. Most traditional approaches to IPs interleave parameter updates with numerical solution of the PDE; our algorithm solves for both parameter and PDE solution simultaneously. Experiments on nonlinear elliptic PDEs, Burgers' equation, a regularized Eikonal equation, and an IP for permeability identification in Darcy flow illustrate the efficacy and scope of our framework.
Submitted 10 August, 2021; v1 submitted 23 March, 2021;
originally announced March 2021.
-
Model Reduction and Neural Networks for Parametric PDEs
Authors:
Kaushik Bhattacharya,
Bamdad Hosseini,
Nikola B. Kovachki,
Andrew M. Stuart
Abstract:
We develop a general framework for data-driven approximation of input-output maps between infinite-dimensional spaces. The proposed approach is motivated by the recent successes of neural networks and deep learning, in combination with ideas from model reduction. This combination results in a neural network approximation which, in principle, is defined on infinite-dimensional spaces and, in practice, is robust to the dimension of finite-dimensional approximations of these spaces required for computation. For a class of input-output maps, and suitably chosen probability measures on the inputs, we prove convergence of the proposed approximation methodology. We also include numerical experiments which demonstrate the effectiveness of the method, showing convergence and robustness of the approximation scheme with respect to the size of the discretization, and compare it with existing algorithms from the literature; our examples include the mapping from coefficient to solution in a divergence form elliptic partial differential equation (PDE) problem, and the solution operator for viscous Burgers' equation.
Submitted 17 June, 2021; v1 submitted 6 May, 2020;
originally announced May 2020.
-
Spectral Analysis Of Weighted Laplacians Arising In Data Clustering
Authors:
Franca Hoffmann,
Bamdad Hosseini,
Assad A. Oberai,
Andrew M. Stuart
Abstract:
Graph Laplacians computed from weighted adjacency matrices are widely used to identify geometric structure in data, and clusters in particular; their spectral properties play a central role in a number of unsupervised and semi-supervised learning algorithms. When suitably scaled, graph Laplacians approach limiting continuum operators in the large data limit. Studying these limiting operators, therefore, sheds light on learning algorithms. This paper is devoted to the study of a parameterized family of divergence form elliptic operators that arise as the large data limit of graph Laplacians. The link between a three-parameter family of graph Laplacians and a three-parameter family of differential operators is explained. The spectral properties of these differential operators are analyzed in the situation where the data comprises two nearly separated clusters, in a sense which is made precise. In particular, we investigate how the spectral gap depends on the three parameters entering the graph Laplacian, and on a parameter measuring the size of the perturbation from the perfectly clustered case. Numerical results are presented which exemplify and extend the analysis: the computations study situations in which there are two nearly separated clusters, but which violate the assumptions used in our theory; situations in which more than two clusters are present, also going beyond our theory; and situations which demonstrate the relevance of our studies of differential operators for the understanding of finite data problems via the graph Laplacian. The findings provide insight into parameter choices made in learning algorithms which are based on weighted adjacency matrices; they also provide the basis for analysis of the consistency of various unsupervised and semi-supervised learning algorithms, in the large data limit.
Submitted 13 July, 2020; v1 submitted 13 September, 2019;
originally announced September 2019.
-
Consistency of semi-supervised learning algorithms on graphs: Probit and one-hot methods
Authors:
Franca Hoffmann,
Bamdad Hosseini,
Zhi Ren,
Andrew M. Stuart
Abstract:
Graph-based semi-supervised learning is the problem of propagating labels from a small number of labelled data points to a larger set of unlabelled data. This paper is concerned with the consistency of optimization-based techniques for such problems, in the limit where the labels have small noise and the underlying unlabelled data is well clustered. We study graph-based probit for binary classification, and a natural generalization of this method to multi-class classification using one-hot encoding. The resulting objective function to be optimized comprises the sum of a quadratic form defined through a rational function of the graph Laplacian, involving only the unlabelled data, and a fidelity term involving only the labelled data. The consistency analysis sheds light on the choice of the rational function defining the optimization.
Submitted 9 March, 2020; v1 submitted 18 June, 2019;
originally announced June 2019.
-
Geometric structure of graph Laplacian embeddings
Authors:
Nicolas Garcia Trillos,
Franca Hoffmann,
Bamdad Hosseini
Abstract:
We analyze the spectral clustering procedure for identifying coarse structure in a data set $x_1, \dots, x_n$, and in particular study the geometry of graph Laplacian embeddings which form the basis for spectral clustering algorithms. More precisely, we assume that the data is sampled from a mixture model supported on a manifold $\mathcal{M}$ embedded in $\mathbb{R}^d$, and pick a connectivity length-scale $\varepsilon>0$ to construct a kernelized graph Laplacian. We introduce a notion of a well-separated mixture model which only depends on the model itself, and prove that when the model is well separated, with high probability the embedded data set concentrates on cones that are centered around orthogonal vectors. Our results are meaningful in the regime where $\varepsilon = \varepsilon(n)$ is allowed to decay to zero at a slow enough rate as the number of data points grows. This rate depends on the intrinsic dimension of the manifold on which the data is supported.
Submitted 29 January, 2019;
originally announced January 2019.
-
Spectral gaps and error estimates for infinite-dimensional Metropolis-Hastings with non-Gaussian priors
Authors:
Bamdad Hosseini,
James E Johndrow
Abstract:
We study a class of Metropolis-Hastings algorithms for target measures that are absolutely continuous with respect to a large class of non-Gaussian prior measures on Banach spaces. The algorithm is shown to have a spectral gap in a Wasserstein-like semimetric weighted by a Lyapunov function. A number of error bounds are given for computationally tractable approximations of the algorithm, including bounds on the closeness of Cesàro averages and other pathwise quantities via perturbation theory. Several applications, such as various likelihood approximations and perturbations of prior measures, illustrate the breadth of problems to which the results apply.
Submitted 17 May, 2022; v1 submitted 29 September, 2018;
originally announced October 2018.
-
Simultaneous model calibration and source inversion in atmospheric dispersion models
Authors:
Juan G. Garcia,
Bamdad Hosseini,
John M. Stockie
Abstract:
We present a cost-effective method for model calibration and solution of source inversion problems in atmospheric dispersion modelling. We use Gaussian process emulations of atmospheric dispersion models within a Bayesian framework for solution of inverse problems. The model and source parameters are treated as unknowns and we obtain point estimates and approximation of uncertainties for sources while simultaneously calibrating the forward model. The method is validated in the context of an industrial case study involving emissions from a smelting operation for which cumulative monthly measurements of zinc particulate depositions are available.
Submitted 18 July, 2019; v1 submitted 14 June, 2018;
originally announced June 2018.
-
Two Metropolis-Hastings algorithms for posterior measures with non-Gaussian priors in infinite dimensions
Authors:
Bamdad Hosseini
Abstract:
We introduce two classes of Metropolis-Hastings algorithms for sampling target measures that are absolutely continuous with respect to non-Gaussian prior measures on infinite-dimensional Hilbert spaces. In particular, we focus on certain classes of prior measures for which prior-reversible proposal kernels of the autoregressive type can be designed. We then use these proposal kernels to design algorithms that satisfy detailed balance with respect to the target measures. Afterwards, we introduce a new class of prior measures, called the Bessel-K priors, as a generalization of the gamma distribution to measures in infinite dimensions. The Bessel-K priors interpolate between well-known priors such as the gamma distribution and Besov priors and can model sparse or compressible parameters. We present concrete instances of our algorithms for the Bessel-K priors in the context of numerical examples in density estimation, finite-dimensional denoising and deconvolution on the circle.
Submitted 22 April, 2019; v1 submitted 20 April, 2018;
originally announced April 2018.
-
Well-posed Bayesian Inverse Problems with Infinitely-Divisible and Heavy-Tailed Prior Measures
Authors:
Bamdad Hosseini
Abstract:
We present a new class of prior measures in connection to $\ell_p$ regularization techniques when $p \in(0,1)$ which is based on the generalized Gamma distribution. We show that the resulting prior measure is heavy-tailed, non-convex and infinitely divisible. Motivated by this observation we discuss the class of infinitely divisible prior measures and draw a connection between their tail behavior and the tail behavior of their Lévy measures. Next, we use the laws of pure jump Lévy processes in order to define new classes of prior measures that are concentrated on the space of functions with bounded variation. These priors serve as an alternative to the classic total variation prior and result in well-defined inverse problems. We then study the well-posedness of Bayesian inverse problems in a general enough setting that encompasses the above-mentioned classes of prior measures. We establish that well-posedness relies on a balance between the growth of the log-likelihood function and the tail behavior of the prior and apply our results to special cases such as additive noise models and linear problems. Finally, we discuss some of the practical aspects of Bayesian inverse problems such as their consistent approximation and present three concrete examples of well-posed Bayesian inverse problems with heavy-tailed or stochastic process prior measures.
Submitted 21 February, 2017; v1 submitted 23 September, 2016;
originally announced September 2016.
-
Airborne contaminant source estimation using a finite-volume forward solver coupled with a Bayesian inversion approach
Authors:
Bamdad Hosseini,
John M. Stockie
Abstract:
We propose a numerical algorithm for solving the atmospheric dispersion problem with elevated point sources and ground-level deposition. The problem is modelled by the 3D advection-diffusion equation with delta-distribution source terms, as well as height-dependent advection speed and diffusion coefficients. We construct a finite volume scheme using a splitting approach in which the Clawpack software package is used as the advection solver and an implicit time discretization is proposed for the diffusion terms. The algorithm is then applied to an actual industrial scenario involving emissions of airborne particulates from a zinc smelter using actual wind measurements. We also address various practical considerations such as choosing appropriate methods for regularizing noisy wind data and quantifying sensitivity of the model to parameter uncertainty. Afterwards, we use the algorithm within a Bayesian framework for estimating emission rates of zinc from multiple sources over the industrial site. We compare our finite volume solver with a Gaussian plume solver within the Bayesian framework and demonstrate that the finite volume solver results in tighter uncertainty bounds on the estimated emission rates.
Submitted 19 September, 2016; v1 submitted 12 July, 2016;
originally announced July 2016.
-
Well-posed Bayesian Inverse Problems: Priors with Exponential Tails
Authors:
Bamdad Hosseini,
Nilima Nigam
Abstract:
We consider the well-posedness of Bayesian inverse problems when the prior measure has exponential tails. In particular, we consider the class of convex (log-concave) probability measures which include the Gaussian and Besov measures as well as certain classes of hierarchical priors. We identify appropriate conditions on the likelihood distribution and the prior measure which guarantee existence, uniqueness and stability of the posterior measure with respect to perturbations of the data. We also consider consistent approximations of the posterior such as discretization by projection. Finally, we present a general recipe for construction of convex priors on Banach spaces which will be of interest in practical applications where one often works with spaces such as $L^2$ or the continuous functions.
Submitted 23 February, 2017; v1 submitted 9 April, 2016;
originally announced April 2016.
-
On regularizations of the delta distribution
Authors:
Bamdad Hosseini,
Nilima Nigam,
John M. Stockie
Abstract:
In this article we consider regularizations of the Dirac delta distribution with applications to prototypical elliptic and hyperbolic partial differential equations (PDEs). We study the convergence of a sequence of distributions $\mathcal{S}_H$ to a singular term $\mathcal{S}$ as a parameter $H$ (associated with the support size of $\mathcal{S}_H$) shrinks to zero. We characterize this convergence both in the weak-$\ast$ topology of distributions and in a weighted Sobolev norm. These notions motivate a framework for constructing regularizations of the delta distribution that includes a large class of existing methods in the literature. This framework allows different regularizations to be compared. The convergence of solutions of PDEs with these regularized source terms is then studied in various topologies such as pointwise convergence on a deleted neighborhood and weighted Sobolev norms. We also examine the lack of symmetry in tensor product regularizations and effects of dissipative error in hyperbolic problems.
Submitted 22 May, 2015; v1 submitted 12 December, 2014;
originally announced December 2014.