-
An optimal experimental design approach to sensor placement in continuous stochastic filtering
Authors:
Sahani Pathiraja,
Claudia Schillings,
Philipp Wacker
Abstract:
Sequential filtering and spatial inverse problems assimilate data points distributed either temporally (in the case of filtering) or spatially (in the case of spatial inverse problems). Sometimes it is possible to choose the position of these data points (which we call sensors here) in advance, with the goal of maximising the expected information gain (or a different metric of performance) from future data, and this leads to an Optimal Experimental Design (OED) problem. Here we revisit an interpretation of optimising sensor placement as an integration with respect to a general probability measure $ξ$. This generalises the problem of discrete-time sensor placement (which corresponds to the special case where the probability measure is a mixture of Diracs) to an infinite-dimensional, but mathematically more well-behaved setting. We focus on the continuous-time stochastic filtering setting, whose solution is governed by the Zakai equation. We derive an expression for the Fréchet derivative of a general OED utility functional, the key to which is an adjoint (backwards in time) differential equation. This paves the way for utilising new gradient-based methods for solving the corresponding optimisation problem, as a potentially more efficient alternative to (semi-)discrete optimisation methods, e.g. based on greedy insertion and deletion of sensor placements.
Submitted 17 August, 2025;
originally announced August 2025.
-
Sequential Monte Carlo approximations of Wasserstein--Fisher--Rao gradient flows
Authors:
Francesca R. Crucinio,
Sahani Pathiraja
Abstract:
We consider the problem of sampling from a probability distribution $π$. It is well known that this can be written as an optimisation problem over the space of probability distributions in which we aim to minimise the Kullback--Leibler divergence from $π$. We consider several partial differential equations (PDEs) whose solution is a minimiser of the Kullback--Leibler divergence from $π$ and connect them to well-known Monte Carlo algorithms. We focus in particular on PDEs obtained by considering the Wasserstein--Fisher--Rao geometry over the space of probabilities and show that these lead to a natural implementation using importance sampling and sequential Monte Carlo. We propose a novel algorithm to approximate the Wasserstein--Fisher--Rao flow of the Kullback--Leibler divergence which empirically outperforms the current state-of-the-art.
We study tempered versions of these PDEs obtained by replacing the target distribution with a geometric mixture of initial and target distribution and show that these do not lead to a convergence speed-up.
Submitted 6 June, 2025;
originally announced June 2025.
-
Mathematical description of continuous time and space replicator-mutator equations for quadratic fitness landscapes
Authors:
Sahani Pathiraja,
Philipp Wacker
Abstract:
The replicator-mutator equation is a model for populations of individuals carrying different traits, with a fitness function mediating their ability to replicate, and a stochastic model for mutation. We derive analytical solutions for the replicator-mutator equation in continuous time and for continuous traits for a quadratic fitness function. Using these results we can explain and quantify (without the need for numerical in-silico simulations) a series of evolutionary phenomena, in particular the flying kite effect, survival of the flattest, and the ability of a population to sustain itself while tracking an optimal feature which may be fixed, moving with bounded velocity in trait space, oscillating, or randomly fluctuating.
Submitted 12 December, 2024; v1 submitted 11 December, 2024;
originally announced December 2024.
-
Connections between sequential Bayesian inference and evolutionary dynamics
Authors:
Sahani Pathiraja,
Philipp Wacker
Abstract:
It has long been posited that there is a connection between the dynamical equations describing evolutionary processes in biology and sequential Bayesian learning methods. This manuscript describes new research in which this precise connection is rigorously established in the continuous time setting. Here we focus on a partial differential equation known as the Kushner-Stratonovich equation describing the evolution of the posterior density in time. Of particular importance is a piecewise smooth approximation of the observation path, from which the discrete time filtering equations are obtained; these are shown to converge to a Stratonovich interpretation of the Kushner-Stratonovich equation. This smooth formulation will then be used to draw precise connections between nonlinear stochastic filtering and replicator-mutator dynamics. Additionally, gradient flow formulations will be investigated as well as a form of replicator-mutator dynamics which is shown to be beneficial for the misspecified model filtering problem. It is hoped this work will spur further research into exchanges between sequential learning and evolutionary biology and inspire new algorithms in filtering and sampling.
Submitted 12 March, 2025; v1 submitted 25 November, 2024;
originally announced November 2024.
-
PULASki: Learning inter-rater variability using statistical distances to improve probabilistic segmentation
Authors:
Soumick Chatterjee,
Franziska Gaidzik,
Alessandro Sciarra,
Hendrik Mattern,
Gábor Janiga,
Oliver Speck,
Andreas Nürnberger,
Sahani Pathiraja
Abstract:
In the domain of medical imaging, many supervised learning based methods for segmentation face several challenges such as high variability in annotations from multiple experts, paucity of labelled data and class imbalanced datasets. These issues may result in segmentations that lack the requisite precision for clinical analysis and can be misleadingly overconfident without associated uncertainty quantification. This work proposes the PULASki method as a computationally efficient generative tool for biomedical image segmentation that accurately captures variability in expert annotations, even in small datasets. This approach makes use of an improved loss function based on statistical distances in a conditional variational autoencoder structure (Probabilistic UNet), which improves learning of the conditional decoder compared to the standard cross-entropy, particularly in class imbalanced problems. The proposed method was analysed for two structurally different segmentation tasks (intracranial vessel and multiple sclerosis (MS) lesion) and compared to four well-established baselines in terms of quantitative metrics and qualitative output. These experiments involve class-imbalanced datasets characterised by challenging features, including suboptimal signal-to-noise ratios and high ambiguity. Empirical results demonstrate that the PULASki method outperforms all baselines at the 5\% significance level. Our experiments are also among the first to present a comparative study of the computationally feasible segmentation of complex geometries using 3D patches and the traditional use of 2D slices. The generated segmentations are shown to be much more anatomically plausible than in the 2D case, particularly for the vessel task.
Submitted 18 March, 2025; v1 submitted 25 December, 2023;
originally announced December 2023.
-
Analysis of the feedback particle filter with diffusion map based approximation of the gain
Authors:
Sahani Pathiraja,
Wilhelm Stannat
Abstract:
Control-type particle filters have been receiving increasing attention over the last decade as a means of obtaining sample based approximations to the sequential Bayesian filtering problem in the nonlinear setting. Here we analyse one such type, namely the feedback particle filter and a recently proposed approximation of the associated gain function based on diffusion maps. The key purpose is to provide analytic insights on the form of the approximate gain, which are of interest in their own right. These are then used to establish a roadmap to obtaining well-posedness and convergence of the finite $N$ system to its mean field limit. A number of possible future research directions are also discussed.
Submitted 17 November, 2021; v1 submitted 6 September, 2021;
originally announced September 2021.
-
L2 convergence of smooth approximations of Stochastic Differential Equations with unbounded coefficients
Authors:
Sahani Pathiraja
Abstract:
The aim of this paper is to obtain convergence in mean in the uniform topology of piecewise linear approximations of Stochastic Differential Equations (SDEs) with $C^1$ drift and $C^2$ diffusion coefficients with uniformly bounded derivatives. Convergence analyses for such Wong-Zakai approximations most often assume that the coefficients of the SDE are uniformly bounded. Almost sure convergence in the unbounded case can be obtained using now standard rough path techniques, although $L^q$ convergence appears yet to be established and is of importance for several applications involving Monte-Carlo approximations. We consider $L^2$ convergence in the unbounded case using a combination of traditional stochastic analysis and rough path techniques. We expect our proof technique to extend to more general piecewise smooth approximations.
Submitted 25 November, 2020;
originally announced November 2020.
-
McKean-Vlasov SDEs in nonlinear filtering
Authors:
Sahani Pathiraja,
Sebastian Reich,
Wilhelm Stannat
Abstract:
Various particle filters have been proposed over the last couple of decades with the common feature that the update step is governed by a type of control law. This feature makes them an attractive alternative to traditional sequential Monte Carlo, which scales poorly with the state dimension due to weight degeneracy. This article proposes a unifying framework that allows one to systematically derive the McKean-Vlasov representations of these filters for the discrete time and continuous time observation case, taking inspiration from the smooth approximation of the data considered in Crisan & Xiong (2010) and Clark & Crisan (2005). We consider three filters that have been proposed in the literature and use this framework to derive Itô representations of their limiting forms as the approximation parameter $δ\rightarrow 0$. All filters require the solution of a Poisson equation defined on $\mathbb{R}^{d}$, for which existence and uniqueness of solutions can be a non-trivial issue. We additionally establish conditions on the signal-observation system that ensure well-posedness of the weighted Poisson equation arising in one of the filters.
Submitted 17 November, 2021; v1 submitted 24 July, 2020;
originally announced July 2020.
-
Discrete gradients for computational Bayesian inference
Authors:
Sahani Pathiraja,
Sebastian Reich
Abstract:
In this paper, we exploit the gradient flow structure of continuous-time formulations of Bayesian inference in terms of their numerical time-stepping. We focus on two particular examples, namely, the continuous-time ensemble Kalman-Bucy filter and a particle discretisation of the Fokker-Planck equation associated to Brownian dynamics. Both formulations can lead to stiff differential equations which require special numerical methods for their efficient numerical implementation. We compare discrete gradient methods to alternative semi-implicit and other iterative implementations of the underlying Bayesian inference problems.
Submitted 21 June, 2019; v1 submitted 1 March, 2019;
originally announced March 2019.
-
Ensemble transform algorithms for nonlinear smoothing problems
Authors:
Jana de Wiljes,
Sahani Pathiraja,
Sebastian Reich
Abstract:
Several numerical tools designed to overcome the challenges of smoothing in a nonlinear and non-Gaussian setting are investigated for a class of particle smoothers. The considered family of smoothers is induced by the class of linear ensemble transform filters, which contains classical filters such as the stochastic ensemble Kalman filter, the ensemble square root filter and the recently introduced nonlinear ensemble transform filter. Further, the ensemble transform particle smoother is introduced and particularly highlighted as it is consistent in the particle limit and does not require assumptions with respect to the family of the posterior distribution. The linear update pattern of the considered class of linear ensemble transform smoothers allows one to implement important supplementary techniques such as adaptive spread corrections, hybrid formulations, and localization in order to facilitate their application to complex estimation problems. These additional features are derived and numerically investigated for a sequence of increasingly challenging test problems.
Submitted 28 October, 2019; v1 submitted 18 January, 2019;
originally announced January 2019.
-
Multiplicative non-Gaussian model error estimation in data assimilation
Authors:
Sahani Pathiraja,
Peter Jan van Leeuwen
Abstract:
Model uncertainty quantification is an essential component of effective data assimilation. Model errors associated with sub-grid scale processes are often represented through stochastic parameterizations of the unresolved process. Many existing stochastic parameterization schemes are only applicable when knowledge of the true sub-grid scale process or full observations of the coarse scale process are available, which is typically not the case in real applications. We present a methodology for estimating the statistics of sub-grid scale processes for the more realistic case in which only partial observations of the coarse scale process are available. Model error realizations are estimated over a training period by minimizing their conditional sum of squared deviations given some informative covariates (e.g. state of the system), constrained by available observations and assuming that the observation errors are smaller than the model errors. From these realizations a conditional probability distribution of additive model errors given these covariates is obtained, allowing for complex non-Gaussian error structures. Random draws from this density are then used in actual ensemble data assimilation experiments. We demonstrate the efficacy of the approach through numerical experiments with the multi-scale Lorenz 96 system using both small and large time scale separations between slow (coarse scale) and fast (fine scale) variables. The resulting error estimates and forecasts obtained with this new method are superior to those from two existing methods.
Submitted 10 April, 2021; v1 submitted 24 July, 2018;
originally announced July 2018.
-
Perturbations and projections of Kalman-Bucy semigroups
Authors:
Adrian N. Bishop,
Pierre Del Moral,
Sahani D. Pathiraja
Abstract:
We analyse various perturbations and projections of Kalman-Bucy semigroups and Riccati equations. For example, covariance inflation-type perturbations and localisation methods (projections) are common in the ensemble Kalman filtering literature. In the limit of these ensemble methods, the regularised sample covariance tends toward a solution of a perturbed/projected Riccati equation. With this motivation, results are given characterising the error between the nominal and regularised Riccati flows and Kalman-Bucy filtering distributions. New projection-type models are also discussed; e.g. Bose-Mesner projections. These regularisation models are also of interest on their own, and in, e.g., differential games, control of stochastic/jump processes, and robust control.
Submitted 1 December, 2018; v1 submitted 20 January, 2017;
originally announced January 2017.