
Local Sequential MCMC for Data Assimilation with Applications in Geoscience

Abstract: This paper presents a new data assimilation (DA) scheme based on a sequential Markov Chain Monte Carlo (SMCMC) DA technique [Ruzayqat et al. 2024] which is provably convergent and has been recently used for filtering, particularly for high-dimensional non-linear, and potentially, non-Gaussian state-space models. Unlike particle filters, which can be considered exact methods and can be used for filtering non-linear, non-Gaussian models, SMCMC does not assign weights to the samples/particles, and therefore, the method does not suffer from the issue of weight-degeneracy when a relatively small number of samples is used. We design a localization approach within the SMCMC framework that focuses on regions where observations are located and restricts the transition densities included in the filtering distribution of the state to these regions. This greatly reduces the effective degrees of freedom and thus improves efficiency. We test the new technique on a high-dimensional ($d \sim 10^4 - 10^5$) linear Gaussian model and on non-linear shallow water models with Gaussian noise, using real and synthetic observations. For two of the numerical examples, the observations mimic the data generated by the Surface Water and Ocean Topography (SWOT) mission led by NASA, which is a swath of ocean height observations that changes location at every assimilation time step. We also use a set of real observations from ocean drifters, in which the drifters move according to the ocean kinematics and are assumed to have uncertain locations at the time of assimilation. We show that when higher accuracy is required, the proposed algorithm is superior in terms of efficiency and accuracy over competing ensemble methods and the original SMCMC filter.
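The weight-degeneracy issue mentioned in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: the toy model (standard Gaussian prior in d dimensions, an observation at zero with unit Gaussian noise in each coordinate) and the function names are assumptions made purely for illustration. It computes the effective sample size (ESS) of importance weights, which tends to collapse as the dimension grows.

```python
import math
import random


def effective_sample_size(log_weights):
    """ESS = (sum w)^2 / sum w^2, computed stably from log-weights."""
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    s = sum(w)
    return s * s / sum(x * x for x in w)


def toy_log_weights(n_particles, dim, rng):
    """Hypothetical toy model (an assumption, not from the paper):
    particles x ~ N(0, I_d), observation y = 0 with unit Gaussian noise
    per coordinate, so log w(x) = -0.5 * |x|^2 up to a constant."""
    out = []
    for _ in range(n_particles):
        sq_norm = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim))
        out.append(-0.5 * sq_norm)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    n = 200
    for dim in (1, 10, 100):
        ess = effective_sample_size(toy_log_weights(n, dim, rng))
        print(f"d = {dim:4d}  ESS = {ess:.1f} of {n}")
```

Running the script prints the ESS for each dimension; in this toy setting the ESS in d = 100 is typically a small fraction of the particle count, which is the behaviour that weight-free samplers such as SMCMC are designed to avoid.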

Submitted 11 September, 2024; originally announced September 2024.

Comments: 24 pages, 8 figures

MSC Class: 62M20; 60G35; 60J20; 94A12; 93E11; 65C40

arXiv:2311.09875

Unbiased and Multilevel Methods for a Class of Diffusions Partially Observed via Marked Point Processes

Authors: Miguel Alvarez, Ajay Jasra, Hamza Ruzayqat

Abstract: In this article we consider the filtering problem associated to partially observed diffusions, with observations following a marked point process. In the model, the data form a point process with observation times whose intensity is driven by a diffusion, with the associated marks also depending upon the diffusion process. We assume that one must resort to time-discretizing the diffusion process and develop particle and multilevel particle filters to recursively approximate the filter. In particular, we prove that our multilevel particle filter can achieve a mean square error (MSE) of $\mathcal{O}(ε^2)$ ($ε>0$ and arbitrary) with a cost of $\mathcal{O}(ε^{-2.5})$ versus using a particle filter which has a cost of $\mathcal{O}(ε^{-3})$ to achieve the same MSE. We then show how this methodology can be extended to give unbiased (that is, with no time-discretization error) estimators of the filter, which are proved to have finite variance and, with high probability, finite cost. Finally, we extend our methodology to the problem of online static-parameter estimation.

Submitted 16 November, 2023; originally announced November 2023.

Comments: 20 pages, 12 figures

MSC Class: 60G55; 60G35; 62M20; 62F30

arXiv:2310.03114

Bayesian Parameter Inference for Partially Observed Stochastic Volterra Equations

Authors: Ajay Jasra, Hamza Ruzayqat, Amin Wu

Abstract: In this article we consider Bayesian parameter inference for a type of partially observed stochastic Volterra equation (SVE). SVEs are found in many areas such as physics and mathematical finance. In the latter field they can be used to represent long memory in unobserved volatility processes. In many cases of practical interest, SVEs must be time-discretized and then parameter inference is based upon the posterior associated to this time-discretized process. Based upon recent studies on time-discretization of SVEs (e.g. Richard et al. 2021), we use Euler-Maruyama methods for the aforementioned discretization. We then show how multilevel Markov chain Monte Carlo (MCMC) methods (Jasra et al. 2018) can be applied in this context. In the examples we study, we give a proof that shows that the cost to achieve a mean square error (MSE) of $\mathcal{O}(ε^2)$, $ε>0$, is $\mathcal{O}(ε^{-\tfrac{4}{2H+1}})$, where $H$ is the Hurst parameter. If one uses a single-level MCMC method then the cost is $\mathcal{O}(ε^{-\tfrac{2(2H+3)}{2H+1}})$ to achieve the same MSE. We illustrate these results in the context of state-space and stochastic volatility models, with the latter applied to real data.
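As a quick check of the two cost rates quoted in this abstract, the short calculation below compares the exponents. The specific values of $H$ are illustrative assumptions and are not taken from the paper.

```latex
% Difference between the single-level and multilevel cost exponents:
\[
  \frac{2(2H+3)}{2H+1} - \frac{4}{2H+1}
  = \frac{4H+2}{2H+1}
  = 2 ,
\]
% so, on these rates, the multilevel method is cheaper by a factor of
% order \epsilon^{-2} for every admissible H.
%
% Illustrative values (assumed, not from the paper):
%   H = 1/2:  multilevel O(\epsilon^{-2}),    single-level O(\epsilon^{-4})
%   H = 1/4:  multilevel O(\epsilon^{-8/3}),  single-level O(\epsilon^{-14/3})
```

The gap between the two rates is therefore independent of the Hurst parameter, while both costs grow as $H$ decreases.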

Submitted 19 February, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: 17 pages, 5 figures

MSC Class: 62F15; 62M20; 60J10; 60J22; 65C40

arXiv:2309.13557

Bayesian Parameter Inference for Partially Observed Diffusions using Multilevel Stochastic Runge-Kutta Methods

Authors: Pierre Del Moral, Shulan Hu, Ajay Jasra, Hamza Ruzayqat, Xinyu Wang

Abstract: We consider the problem of Bayesian estimation of static parameters associated to a partially and discretely observed diffusion process. We assume that the exact transition dynamics of the diffusion process are unavailable, even up to an unbiased estimator, and that one must time-discretize the diffusion process. In such scenarios it has been shown how one can introduce the multilevel Monte Carlo method to reduce the cost to compute posterior expected values of the parameters for a pre-specified mean square error (MSE). These aforementioned methods rely upon the Euler-Maruyama discretization scheme, which is well-known in numerical analysis to have slow convergence properties. We adapt stochastic Runge-Kutta (SRK) methods for Bayesian parameter estimation of static parameters for diffusions. These can be implemented for high-dimensional diffusions and are seemingly under-appreciated in the uncertainty quantification and statistics fields. For a class of diffusions and SRK methods, we consider the estimation of the posterior expectation of the parameters. We prove that to achieve an MSE of $\mathcal{O}(ε^2)$, for $ε>0$ given, the associated work is $\mathcal{O}(ε^{-2})$. Whilst the latter is achievable for the Milstein scheme, this method is often not applicable for diffusions in dimension larger than two. We also illustrate our methodology in several numerical examples.

Submitted 24 September, 2023; originally announced September 2023.

arXiv:2309.10589

Unbiased Parameter Estimation for Partially Observed Diffusions

Authors: Elsiddig Awadelkarim, Ajay Jasra, Hamza Ruzayqat

Abstract: In this article we consider the estimation of static parameters for a partially observed diffusion process with discrete-time observations over a fixed time interval. In particular, we assume that one must time-discretize the partially observed diffusion process and work with the model with bias and consider maximizing the resulting log-likelihood. Using a novel double randomization scheme, based upon Markovian stochastic approximation, we develop a new method to unbiasedly estimate the static parameters, that is, to obtain the maximum likelihood estimator with no time discretization bias. Under assumptions we prove that our estimator is unbiased and investigate the method in several numerical examples, showing that it can empirically out-perform existing unbiased methodology.

Submitted 19 September, 2023; originally announced September 2023.

Comments: 27 pages, 8 figures

MSC Class: 60J22; 62M05; 65C40; 62M20

arXiv:2305.00484

Sequential Markov Chain Monte Carlo for Lagrangian Data Assimilation with Applications to Unknown Data Locations

Authors: Hamza Ruzayqat, Alexandros Beskos, Dan Crisan, Ajay Jasra, Nikolas Kantas

Abstract: We consider a class of high-dimensional spatial filtering problems, where the spatial locations of observations are unknown and driven by the partially observed hidden signal. This problem is exceptionally challenging as not only is it high-dimensional, but the model for the signal yields longer-range time dependencies through the observation locations. Motivated by this model we revisit a lesser-known and provably convergent computational methodology from \cite{berzuini, cent, martin} that uses sequential Markov Chain Monte Carlo (MCMC) chains. We extend this methodology for data filtering problems with unknown observation locations. We benchmark our algorithms on linear Gaussian state space models against competing ensemble methods and demonstrate a significant improvement in both execution speed and accuracy. Finally, we implement a realistic case study on a high-dimensional rotating shallow water model (of about $10^4-10^5$ dimensions) with real and synthetic data. The data is provided by the National Oceanic and Atmospheric Administration (NOAA) and contains observations from ocean drifters in a domain of the Atlantic Ocean restricted to the longitude and latitude intervals $[-51^{\circ}, -41^{\circ}]$, $[17^{\circ}, 27^{\circ}]$ respectively.

Submitted 5 March, 2024; v1 submitted 30 April, 2023; originally announced May 2023.

Comments: 31 pages, 23 figures, 1 table

MSC Class: 62M20; 60G35; 60J20; 94A12; 93E11; 65C40

arXiv:2206.07202

Unbiased Estimation using Underdamped Langevin Dynamics

Authors: Hamza Ruzayqat, Neil K. Chada, Ajay Jasra

Abstract: In this work we consider the unbiased estimation of expectations w.r.t. probability measures that have non-negative Lebesgue density, and which are known point-wise up to a normalizing constant. We focus upon developing an unbiased method via the underdamped Langevin dynamics, which has proven to be popular of late due to applications in statistics and machine learning. Specifically, in continuous time, the dynamics can be constructed so that as the time goes to infinity they admit the probability of interest as a stationary measure. In many cases, time-discretized versions of the underdamped Langevin dynamics are used in practice which are run only with a fixed number of iterations. We develop a novel scheme based upon doubly randomized estimation as in \cite{ub_grad,disc_model}, which requires access only to time-discretized versions of the dynamics. The proposed scheme aims to remove the discretization bias and the bias resulting from running the dynamics for a finite number of iterations. We prove, under standard assumptions, that our estimator is of finite variance and either has finite expected cost, or has finite cost with a high probability. To illustrate our theoretical findings we provide numerical experiments which verify our theory, which include challenging examples from Bayesian statistics and statistical physics.
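To make the two sources of bias named in this abstract concrete, the following sketch simulates a time-discretized underdamped Langevin dynamics for a fixed number of iterations. It is a plain Euler-Maruyama discretization, not the paper's unbiased scheme; the one-dimensional standard Gaussian target, the friction value, the step size and all function names are assumptions made for illustration.

```python
import math
import random


def grad_potential(x):
    """Gradient of U(x) = x^2 / 2, i.e. a standard Gaussian target (toy choice)."""
    return x


def underdamped_langevin_em(n_steps, step, friction=1.0, seed=0):
    """Euler-Maruyama discretization of
        dX = V dt,
        dV = -(grad U(X) + friction * V) dt + sqrt(2 * friction) dW,
    whose continuous-time stationary law has X-marginal proportional to exp(-U).
    Returns the list of simulated positions."""
    rng = random.Random(seed)
    x, v = 0.0, 0.0
    noise_scale = math.sqrt(2.0 * friction * step)
    positions = []
    for _ in range(n_steps):
        x_new = x + step * v
        v = (v - step * (grad_potential(x) + friction * v)
             + noise_scale * rng.gauss(0.0, 1.0))
        x = x_new
        positions.append(x)
    return positions


if __name__ == "__main__":
    xs = underdamped_langevin_em(n_steps=100_000, step=0.01)
    second_moment = sum(x * x for x in xs) / len(xs)
    # The exact value under the target is 1; the estimate carries both a
    # step-size bias and a finite-run-length bias.
    print(f"estimated E[X^2] = {second_moment:.3f}")
```

Any estimate produced this way depends on the chosen step size and run length, which is the bias the abstract's doubly randomized scheme aims to remove.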

Submitted 15 August, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

Comments: 27 pages, 13 figures

MSC Class: 60J22; 65C05; 65C40; 82C31; 62G08; 35Q56

arXiv:2203.03013

doi 10.1016/j.jcp.2022.111643

Unbiased Estimation using a Class of Diffusion Processes

Authors: Hamza Ruzayqat, Alexandros Beskos, Dan Crisan, Ajay Jasra, Nikolas Kantas

Abstract: We study the problem of unbiased estimation of expectations with respect to (w.r.t.) $π$, a given, general probability measure on $(\mathbb{R}^d,\mathcal{B}(\mathbb{R}^d))$ that is absolutely continuous with respect to a standard Gaussian measure. We focus on simulation associated to a particular class of diffusion processes, sometimes termed the Schrödinger-Föllmer Sampler, which is a simulation technique that approximates the law of a particular diffusion bridge process $\{X_t\}_{t\in [0,1]}$ on $\mathbb{R}^d$, $d\in \mathbb{N}_0$. This latter process is constructed such that, starting at $X_0=0$, one has $X_1\sim π$. Typically, the drift of the diffusion is intractable and, even if it were not, exact sampling of the associated diffusion is not possible. As a result, \cite{sf_orig,jiao} consider a stochastic Euler-Maruyama scheme that allows the development of biased estimators for expectations w.r.t. $π$. We show that for this methodology to achieve a mean square error of $\mathcal{O}(ε^2)$, for arbitrary $ε>0$, the associated cost is $\mathcal{O}(ε^{-5})$. We then introduce an alternative approach that provides unbiased estimates of expectations w.r.t. $π$, that is, it does not suffer from the time discretization bias or the bias related to the approximation of the drift function. We prove that to achieve a mean square error of $\mathcal{O}(ε^2)$, the associated cost is, with high probability, $\mathcal{O}(ε^{-2}|\log(ε)|^{2+δ})$, for any $δ>0$. We implement our method on several examples including Bayesian inverse problems.

Submitted 19 September, 2022; v1 submitted 6 March, 2022; originally announced March 2022.

Comments: 27 pages, 11 figures

MSC Class: 60J60; 62D05; 65C40

arXiv:2112.13874

Unbiased Parameter Inference for a Class of Partially Observed Levy-Process Models

Authors: Hamza Ruzayqat, Ajay Jasra

Abstract: We consider the problem of static Bayesian inference for partially observed Levy-process models. We develop a methodology which allows one to infer static parameters and some states of the process, without a bias from the time-discretization of the aforementioned Levy process. The unbiased method is exceptionally amenable to parallel implementation and can be computationally efficient relative to competing approaches. We implement the method on S&P 500 log-return daily data and compare it to a Markov chain Monte Carlo (MCMC) algorithm.

Submitted 31 March, 2022; v1 submitted 27 December, 2021; originally announced December 2021.

Comments: 24 pages, 2 figures, 1 table

MSC Class: 65C05; 60H35; 62M05; 62M20; 62F15

arXiv:2110.00884

A Lagged Particle Filter for Stable Filtering of certain High-Dimensional State-Space Models

Authors: Hamza Ruzayqat, Aimad Er-Raiy, Alexandros Beskos, Dan Crisan, Ajay Jasra, Nikolas Kantas

Abstract: We consider the problem of high-dimensional filtering of state-space models (SSMs) at discrete times. This problem is particularly challenging as analytical solutions are typically not available and many numerical approximation methods can have a cost that scales exponentially with the dimension of the hidden state. Inspired by lag-approximation methods for the smoothing problem, we introduce a lagged approximation of the smoothing distribution that is necessarily biased. For certain classes of SSMs, particularly those that forget the initial condition exponentially fast in time, the bias of our approximation is shown to be uniformly controlled in the dimension and exponentially small in time. We develop a sequential Monte Carlo (SMC) method to recursively estimate expectations with respect to our biased filtering distributions. Moreover, we prove for a class of SSMs that can contain dependencies amongst coordinates that as the dimension $d\rightarrow\infty$ the cost to achieve a stable mean square error in estimation, for classes of expectations, is of $\mathcal{O}(Nd^2)$ per-unit time, where $N$ is the number of simulated samples in the SMC algorithm. Our methodology is implemented on several challenging high-dimensional examples including the conservative shallow-water model.

Submitted 12 January, 2022; v1 submitted 2 October, 2021; originally announced October 2021.

Comments: 32 pages, 14 figures

MSC Class: 62M20; 60G35; 60J20; 60J10; 94A12; 93E11

arXiv:2108.03935

Multilevel Estimation of Normalization Constants Using the Ensemble Kalman-Bucy Filter

Authors: Hamza Ruzayqat, Neil K. Chada, Ajay Jasra

Abstract: In this article we consider the application of multilevel Monte Carlo for the estimation of normalizing constants. In particular we will make use of the filtering algorithm, the ensemble Kalman-Bucy filter (EnKBF), which is an N-particle representation of the Kalman-Bucy filter (KBF). The EnKBF is of interest as it coincides with the optimal filter in the continuous-linear setting, i.e. the KBF. This motivates our particular setup in the linear setting. The resulting methodology we will use is the multilevel ensemble Kalman-Bucy filter (MLEnKBF). We provide an analysis based on deriving Lq-bounds for the normalizing constants using both the single-level and the multilevel algorithms. Our results will be highlighted through numerical results, where we firstly demonstrate the error-to-cost rates of the MLEnKBF comparing it to the EnKBF on a linear Gaussian model. Our analysis will be specific to one variant of the MLEnKBF, whereas the numerics will be tested on different variants. We also exploit this methodology for parameter estimation, where we test this on models arising in atmospheric sciences, such as the stochastic Lorenz 63 and 96 models.

Submitted 19 September, 2022; v1 submitted 9 August, 2021; originally announced August 2021.

Comments: 33 pages, 21 figures

MSC Class: 60G35; 62F15; 65C05; 62M20

arXiv:2101.11460

Log-Normalization Constant Estimation using the Ensemble Kalman-Bucy Filter with Application to High-Dimensional Models

Authors: Dan Crisan, Pierre Del Moral, Ajay Jasra, Hamza Ruzayqat

Abstract: In this article we consider the estimation of the log-normalization constant associated to a class of continuous-time filtering models. In particular, we consider ensemble Kalman-Bucy filter based estimates based upon several nonlinear Kalman-Bucy diffusions. Based upon new conditional bias results for the mean of the aforementioned methods, we analyze the empirical log-scale normalization constants in terms of their $\mathbb{L}_n$-errors and conditional bias. Depending on the type of nonlinear Kalman-Bucy diffusion, we show that these are of order $(\sqrt{t/N}) + t/N$ or $1/\sqrt{N}$ ($\mathbb{L}_n$-errors) and of order $[t+\sqrt{t}]/N$ or $1/N$ (conditional bias), where $t$ is the time horizon and $N$ is the ensemble size. Finally, we use these results for online static parameter estimation for the above filtering models and implement the methodology for both linear and nonlinear models.

Submitted 27 January, 2021; originally announced January 2021.

Comments: 25 pages, 27 figures

MSC Class: 65C05; 65C20; 62F99; 62M20; 60G35

arXiv:2008.07803

Score-Based Parameter Estimation for a Class of Continuous-Time State Space Models

Authors: Alexandros Beskos, Dan Crisan, Ajay Jasra, Nikolas Kantas, Hamza Ruzayqat

Abstract: We consider the problem of parameter estimation for a class of continuous-time state space models. In particular, we explore the case of a partially observed diffusion, with data also arriving according to a diffusion process. Based upon a standard identity of the score function, we consider two particle filter based methodologies to estimate the score function. Both methods rely on an online estimation algorithm for the score function of $\mathcal{O}(N^2)$ cost, with $N\in\mathbb{N}$ the number of particles. The first approach employs a simple Euler discretization and standard particle smoothers and is of cost $\mathcal{O}(N^2 + NΔ_l^{-1})$ per unit time, where $Δ_l=2^{-l}$, $l\in\mathbb{N}_0$, is the time-discretization step. The second approach is new and based upon a novel diffusion bridge construction. It yields a new backward type Feynman-Kac formula in continuous-time for the score function and is presented along with a particle method for its approximation. Considering a time-discretization, the cost is $\mathcal{O}(N^2Δ_l^{-1})$ per unit time. To improve computational costs, we then consider multilevel methodologies for the score function. We illustrate our parameter estimation method via stochastic gradient approaches in several numerical examples.

Submitted 15 March, 2021; v1 submitted 18 August, 2020; originally announced August 2020.

Comments: 32 pages, 32 figures

MSC Class: 65C05; 65C35; 60G35; 60J60; 60J65; 60H10; 60H35; 65C30; 91G60; 93E11; 62F99

arXiv:2002.01270

Unbiased Estimation of the Solution to Zakai's Equation

Authors: Hamza M. Ruzayqat, Ajay Jasra

Abstract: In the following article we consider the non-linear filtering problem in continuous-time and in particular the solution to Zakai's equation or the normalizing constant. We develop a methodology to produce finite variance, almost surely unbiased estimators of the solution to Zakai's equation. That is, given access to only a first order discretization of the solution to the Zakai equation, we present a method which can remove this discretization bias. The approach, under assumptions, is proved to have finite variance and is numerically compared to using a particular multilevel Monte Carlo method.
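The general idea of removing a discretization bias by randomizing over levels can be sketched on a deterministic toy problem. The code below uses the standard single-term randomization trick, which is a simpler stand-in and not the estimator developed in the paper; the toy sequence of approximations, the geometric level distribution and the function names are all assumptions made for illustration.

```python
import math
import random


def level_approximation(level):
    """Toy 'discretized' quantity: the partial sum of sum_k 2^{-k},
    which converges to 2 as the level increases."""
    return sum(2.0 ** (-k) for k in range(level + 1))


def single_term_estimate(rng, ratio=0.6):
    """Draw a random level L with P(L = l) = (1 - ratio) * ratio**l and
    return (Y_L - Y_{L-1}) / P(L = L), using the convention Y_{-1} = 0.
    The expectation telescopes to the limit of Y_l."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    level = int(math.log(u) / math.log(ratio))
    prob = (1.0 - ratio) * ratio ** level
    previous = level_approximation(level - 1) if level > 0 else 0.0
    return (level_approximation(level) - previous) / prob


if __name__ == "__main__":
    rng = random.Random(1)
    n = 20_000
    average = sum(single_term_estimate(rng) for _ in range(n)) / n
    # Each draw only ever evaluates finite-level approximations, yet the
    # average targets the limiting value 2.
    print(f"average of {n} single-term estimates: {average:.3f}")
```

The level distribution must decay slowly enough relative to the increments for the variance to stay finite; here the increments decay like $2^{-l}$ and the level probabilities like $0.6^{l}$, which satisfies that requirement.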

Submitted 5 February, 2020; v1 submitted 4 February, 2020; originally announced February 2020.

Comments: 17 pages, 4 figures

MSC Class: 65C05; 65C35; 82C80; 93E11; 60G35

Showing 1–14 of 14 results for author: Ruzayqat, H