An Introduction to Solving the Least-Squares Problem in Variational Data Assimilation

Authors: I. Daužickaitė, M. A. Freitag, S. Gürol, A. S. Lawless, A. Ramage, J. A. Scott, J. M. Tabeart

Abstract: Variational data assimilation is a technique for combining measured data with dynamical models. It is a key component of Earth system state estimation and is commonly used in weather and ocean forecasting. The approach involves a large-scale generalized nonlinear least-squares problem. Solving the resulting sequence of sparse linear subproblems requires the use of sophisticated numerical linear algebra methods. In practical applications, the computational demands severely limit the number of iterations of a Krylov subspace solver that can be performed and so high-quality preconditioners are vital. In this paper, we introduce variational data assimilation from a numerical linear algebra perspective and review current solution techniques, with a focus on the challenges that arise in large-scale geophysical systems.

Submitted 10 June, 2025; originally announced June 2025.

MSC Class: 65F05; 65F08; 65F10; 65K10

arXiv:2506.03947 [pdf, ps, other]

Block Alpha-Circulant Preconditioners for All-at-Once Diffusion-Based Covariance Operators

Authors: Jemima M. Tabeart, Selime Gürol, John W. Pearson, Anthony T. Weaver

Abstract: Covariance matrices are central to data assimilation and inverse methods derived from statistical estimation theory. Previous work has considered the application of an all-at-once diffusion-based representation of a covariance matrix operator in order to exploit inherent parallelism in the underlying problem. In this paper, we provide practical methods to apply block $α$-circulant preconditioners to the all-at-once system for the case where the main diffusion operation matrix cannot be readily diagonalized using a discrete Fourier transform. Our new framework applies the block $α$-circulant preconditioner approximately by solving an inner block diagonal problem via a choice of inner iterative approaches. Our first method applies Chebyshev semi-iteration to a symmetric positive definite matrix, shifted by a complex scaling of the identity. We extend theoretical results for Chebyshev semi-iteration in the symmetric positive definite setting, to obtain computable bounds on the asymptotic convergence factor for each of the complex sub-problems. The second approach transforms the complex sub-problem into a (generalized) saddle point system with real coefficients. Numerical experiments reveal that in the case of unlimited computational resources, both methods can match the iteration counts of the 'best-case' block $α$-circulant preconditioner. We also provide a practical adaptation to the nested Chebyshev approach, which improves performance in the case of a limited computational budget. Using an appropriate choice of $α$ our new approaches are robust and efficient in terms of outer iterations and matrix-vector products.

Submitted 16 June, 2025; v1 submitted 4 June, 2025; originally announced June 2025.

Comments: 27 pages, 8 figures, 8 Tables

arXiv:2503.09140 [pdf, other]

On the impact of observation error correlations in data assimilation, with application to along-track altimeter data

Authors: Olivier Goux, Anthony Weaver, Selime Gürol, Oliver Guillet, Andrea Piacentini

Abstract: Data assimilation involves estimating the state of a system by combining observations from various sources with a background estimate of the state. The weights given to the observations and background state depend on their specified error covariance matrices. Observation errors are often assumed to be uncorrelated even though this assumption is inaccurate for many modern data-sets such as those from satellite observing systems. As methods allowing for a more realistic representation of observation-error correlations are emerging, our aim in this article is to provide insight on their expected impact in data assimilation. First, we use a simple idealised system to analyse the effect of observation-error correlations on the spectral characteristics of the solution. Next, we assess the relevance of these results in a more realistic setting in which simulated along-track (nadir) altimeter observations with correlated errors are assimilated in a global ocean model using a three-dimensional variational assimilation (3D-Var) method. Correlated observation errors are modelled in the 3D-Var system using a diffusion operator. When the correlation length scale of observation error is small compared to that of background error, inflating the observation-error variances can mitigate most of the negative effects from neglecting the observation-error correlations. Accounting for observation-error correlations in this situation still outperforms variance inflation since it allows small-scale information in the observations to be more effectively extracted and does not affect the convergence of the minimization. Conversely, when the correlation length scale of observation error is large compared to that of background error, the effect of observation-error correlations cannot be properly approximated with variance inflation. However, the correlation model needs to be constructed carefully to ensure the minimization problem is adequately conditioned so that a robust solution can be obtained. Practical ways to achieve this are discussed.

Submitted 12 March, 2025; originally announced March 2025.

arXiv:2410.02204 [pdf, other]

doi 10.13140/RG.2.2.28678.38725

An Efficient Scaled spectral preconditioner for sequences of symmetric positive definite linear systems

Authors: Youssef Diouane, Selime Gürol, Oussama Mouhtal, Dominique Orban

Abstract: We explore a scaled spectral preconditioner for the efficient solution of sequences of symmetric and positive-definite linear systems. We design the scaled preconditioner not only as an approximation of the inverse of the linear system but also with consideration of its use within the conjugate gradient (CG) method. We propose three different strategies for selecting a scaling parameter, which aim to position the eigenvalues of the preconditioned matrix in a way that reduces the energy norm of the error, the quantity that CG monotonically decreases at each iteration. Our focus is on accelerating convergence especially in the early iterations, which is particularly important when CG is truncated due to computational cost constraints. Numerical experiments in data assimilation confirm that the scaled spectral preconditioner can significantly improve early CG convergence with negligible computational cost.

Submitted 3 October, 2024; originally announced October 2024.

Report number: G-2024-66

MSC Class: 68Q25; 68R10; 68U05

arXiv:2405.04811 [pdf, other]

A general error analysis for randomized low-rank approximation with application to data assimilation

Authors: Alexandre Scotto Di Perrotolo, Youssef Diouane, Selime Gürol, Xavier Vasseur

Abstract: Randomized algorithms have proven to perform well on a large class of numerical linear algebra problems. Their theoretical analysis is critical to provide guarantees on their behaviour, and in this sense, the stochastic analysis of the randomized low-rank approximation error plays a central role. Indeed, several randomized methods for the approximation of dominant eigen- or singular modes can be rewritten as low-rank approximation methods. However, despite the large variety of algorithms, the existing theoretical frameworks for their analysis rely on a specific structure for the covariance matrix that is not adapted to all the algorithms. We propose a general framework for the stochastic analysis of the low-rank approximation error in Frobenius norm for centered and non-standard Gaussian matrices. Under minimal assumptions on the covariance matrix, we derive accurate bounds both in expectation and probability. Our bounds have clear interpretations that enable us to derive properties and motivate practical choices for the covariance matrix resulting in efficient low-rank approximation algorithms. The most commonly used bounds in the literature have been demonstrated as a specific instance of the bounds proposed here, with the additional contribution of being tighter. Numerical experiments related to data assimilation further illustrate that exploiting the problem structure to select the covariance matrix improves the performance as suggested by our bounds.

Submitted 8 May, 2024; originally announced May 2024.

arXiv:2311.06069 [pdf, ps, other]

A filtered multilevel Monte Carlo method for estimating the expectation of cell-centered discretized random fields

Authors: Jérémy Briant, Paul Mycek, Mayeul Destouches, Olivier Goux, Serge Gratton, Selime Gürol, Ehouarn Simon, Anthony T. Weaver

Abstract: In this paper, we investigate the use of multilevel Monte Carlo (MLMC) methods for estimating the expectation of discretized random fields. Specifically, we consider a setting in which the input and output vectors of numerical simulators have inconsistent dimensions across the multilevel hierarchy. This motivates the introduction of grid transfer operators borrowed from multigrid methods. By adapting mathematical tools from multigrid methods, we perform a theoretical spectral analysis of the MLMC estimator of the expectation of discretized random fields, in the specific case of linear, symmetric and circulant simulators. We then propose filtered MLMC (F-MLMC) estimators based on a filtering mechanism similar to the smoothing process of multigrid methods, and we show that the filtering operators improve the estimation of both the small- and large-scale components of the variance, resulting in a reduction of the total variance of the estimator. Next, the conclusions of the spectral analysis are experimentally verified with a one-dimensional illustration. Finally, the proposed F-MLMC estimator is applied to the problem of estimating the discretized variance field of a diffusion-based covariance operator, which amounts to estimating the expectation of a discretized random field. The numerical experiments support the conclusions of the theoretical analysis even with non-linear simulators, and demonstrate the improvements brought by the F-MLMC estimator compared to both a crude MC and an unfiltered MLMC estimator.

Submitted 4 June, 2025; v1 submitted 10 November, 2023; originally announced November 2023.

MSC Class: 65C05; 62P12

arXiv:2306.07017 [pdf, ps, other]

Multivariate extensions of the Multilevel Best Linear Unbiased Estimator for ensemble-variational data assimilation

Authors: Mayeul Destouches, Paul Mycek, Selime Gürol

Abstract: Multilevel estimators aim at reducing the variance of Monte Carlo statistical estimators, by combining samples generated with simulators of different costs and accuracies. In particular, the recent work of Schaden and Ullmann (2020) on the multilevel best linear unbiased estimator (MLBLUE) introduces a framework unifying several multilevel and multifidelity techniques. The MLBLUE is reintroduced here using a variance minimization approach rather than the regression approach of Schaden and Ullmann. We then discuss possible extensions of the scalar MLBLUE to a multidimensional setting, i.e. from the expectation of scalar random variables to the expectation of random vectors. Several estimators of increasing complexity are proposed: a) multilevel estimators with scalar weights, b) with element-wise weights, c) with spectral weights and d) with general matrix weights. The computational cost of each method is discussed. We finally extend the MLBLUE to the estimation of second-order moments in the multidimensional case, i.e. to the estimation of covariance matrices. The multilevel estimators proposed are d) a multilevel estimator with scalar weights and e) with element-wise weights. In large-dimension applications such as data assimilation for geosciences, the latter estimator is computationally unaffordable. As a remedy, we also propose f) a multilevel covariance matrix estimator with optimal multilevel localization, inspired by the optimal localization theory of Ménétrier and Auligné (2015). Some practical details on weighted MLMC estimators of covariance matrices are given in the appendix.

Submitted 12 September, 2024; v1 submitted 12 June, 2023; originally announced June 2023.

Comments: CERFACS Technical Report

Report number: TR-PA-23-67

arXiv:2212.02305 [pdf, other]

Impact of correlated observation errors on the convergence of the conjugate gradient algorithm in variational data assimilation

Authors: Olivier Goux, Selime Gürol, Anthony T. Weaver, Oliver Guillet, Youssef Diouane

Abstract: An important class of nonlinear weighted least-squares problems arises from the assimilation of observations in atmospheric and ocean models. In variational data assimilation, inverse error covariance matrices define the weighting matrices of the least-squares problem. For observation errors, a diagonal matrix (i.e., uncorrelated errors) is often assumed for simplicity even when observation errors are suspected to be correlated. While accounting for observation-error correlations should improve the quality of the solution, it also affects the convergence rate of the minimization algorithms used to iterate to the solution. If the minimization process is stopped before reaching full convergence, which is usually the case in operational applications, the solution may be degraded even if the observation-error correlations are correctly accounted for. In this article, we explore the influence of the observation-error correlation matrix (R) on the convergence rate of a preconditioned conjugate gradient (PCG) algorithm applied to a one-dimensional variational data assimilation (1D-Var) problem. We design the idealised 1D-Var system to include two key features used in more complex systems: we use the background error covariance matrix (B) as a preconditioner (B-PCG); and we use a diffusion operator to model spatial correlations in B and R. Analytical and numerical results with the 1D-Var system show a strong sensitivity of the convergence rate of B-PCG to the parameters of the diffusion-based correlation models. Depending on the parameter choices, correlated observation errors can either speed up or slow down the convergence. In practice, a compromise may be required in the parameter specifications of B and R between staying close to the best available estimates on the one hand and ensuring an adequate convergence rate of the minimization algorithm on the other.

Submitted 5 December, 2022; originally announced December 2022.

arXiv:2206.08793 [pdf, other]

A general error analysis for randomized low-rank approximation methods

Authors: Youssef Diouane, Selime Gürol, Alexandre Scotto Di Perrotolo, Xavier Vasseur

Abstract: We propose a general error analysis related to the low-rank approximation of a given real matrix in both the spectral and Frobenius norms. First, we derive deterministic error bounds that hold with some minimal assumptions. Second, we derive error bounds in expectation in the non-standard Gaussian case, assuming a non-trivial mean and a general covariance matrix for the random matrix variable. The proposed analysis generalizes and improves the error bounds for spectral and Frobenius norms proposed by Halko, Martinsson and Tropp. Third, we consider the Randomized Singular Value Decomposition and specialize our error bounds in expectation in this setting. Numerical experiments on an instructional synthetic test case demonstrate the tightness of the new error bounds.

Submitted 20 June, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

arXiv:2104.00430 [pdf, other]

doi 10.1002/qj.4153

Latent Space Data Assimilation by using Deep Learning

Authors: Mathis Peyron, Anthony Fillion, Selime Gürol, Victor Marchais, Serge Gratton, Pierre Boudier, Gael Goret

Abstract: Performing Data Assimilation (DA) at a low cost is of prime concern in Earth system modeling, particularly at the time of big data where huge quantities of observations are available. Capitalizing on the ability of neural network techniques for approximating the solution of PDEs, we incorporate Deep Learning (DL) methods into a DA framework. More precisely, we exploit the latent structure provided by autoencoders (AEs) to design an Ensemble Transform Kalman Filter with model error (ETKF-Q) in the latent space. Model dynamics are also propagated within the latent space via a surrogate neural network. This novel ETKF-Q-Latent (hereafter referred to as ETKF-Q-L) algorithm is tested on a tailored instructional version of Lorenz 96 equations, named the augmented Lorenz 96 system: it possesses a latent structure that accurately represents the observed dynamics. Numerical experiments based on this particular system show that the ETKF-Q-L approach both reduces the computational cost and provides better accuracy than state-of-the-art algorithms, such as the ETKF-Q.

Submitted 1 April, 2021; originally announced April 2021.

Comments: 15 pages, 7 figures and 3 tables

arXiv:2010.09694 [pdf, other]

Data Assimilation Networks

Authors: Pierre Boudier, Anthony Fillion, Serge Gratton, Selime Gürol, Sixin Zhang

Abstract: Data assimilation (DA) aims at forecasting the state of a dynamical system by combining a mathematical representation of the system with noisy observations taking into account their uncertainties. State-of-the-art methods are based on Gaussian error statistics and the linearization of the non-linear dynamics, which may lead to sub-optimal methods. In this respect, there are still open questions on how to improve these methods. In this paper, we propose a fully data-driven deep learning architecture generalizing recurrent Elman networks and data assimilation algorithms which approximate a sequence of prior and posterior densities conditioned on noisy observations. By construction our approach can be used for general nonlinear dynamics and non-Gaussian densities. On numerical experiments based on the well-known Lorenz-95 system and with Gaussian error statistics, our architecture achieves comparable performance to EnKF on both the analysis and the propagation of probability density functions of the system state at a given time without using any explicit regularization technique.

Submitted 25 May, 2023; v1 submitted 19 October, 2020; originally announced October 2020.

MSC Class: 93B07; 93E11; 60G35; 68T07; 94A17

arXiv:1709.09031 [pdf, other]

doi 10.1002/qj.3262

A note on preconditioning weighted linear least squares, with consequences for weakly-constrained variational data assimilation

Authors: Serge Gratton, Selime Gürol, Ehouarn Simon, Philippe L. Toint

Abstract: The effect of preconditioning linear weighted least-squares using an approximation of the model matrix is analyzed, showing the interplay of the eigenstructures of both the model and weighting matrices. A small example is given illustrating the resulting potential inefficiency of such preconditioners. Consequences of these results in the context of the weakly-constrained 4D-Var data assimilation problem are finally discussed.

Submitted 26 September, 2017; originally announced September 2017.

Comments: 10 pages, 2 figures

MSC Class: 86A5; 86A10; 90C06; 90C30; 15A12

ACM Class: G.1.3; G.1.6

Journal ref: Quarterly Journal of the Royal Meteorological Society, vol. 144(172), pp. 934--940, 2018

arXiv:1709.06383 [pdf, ps, other]

doi 10.1002/qj.3355

On the use of the saddle formulation in weakly-constrained 4D-VAR data assimilation

Authors: S. Gratton, S. Gürol, E. Simon, Ph. L. Toint

Abstract: This paper discusses the practical use of the saddle variational formulation for the weakly-constrained 4D-VAR method in data assimilation. It is shown that the method, in its original form, may produce erratic results or diverge because of the inherent lack of monotonicity of the produced objective function values. Convergent, variationally coherent variants of the algorithm are then proposed whose practical performance is compared to that of other formulations. This comparison is conducted on two data assimilation instances (Burgers equation and the Quasi-Geostrophic model), using two different assumptions on the parallel computing environment. Because these variants essentially retain the parallelization advantages of the original proposal, they often, but not always, perform best, even for moderate numbers of computing processes.

Submitted 19 September, 2017; originally announced September 2017.

Journal ref: Quarterly Journal of the Royal Meteorological Society, 144(717), pp. 2592-2602, 2018

Showing 1–13 of 13 results for author: Gürol, S