-
An Introduction to Solving the Least-Squares Problem in Variational Data Assimilation
Authors:
I. Daužickaitė,
M. A. Freitag,
S. Gürol,
A. S. Lawless,
A. Ramage,
J. A. Scott,
J. M. Tabeart
Abstract:
Variational data assimilation is a technique for combining measured data with dynamical models. It is a key component of Earth system state estimation and is commonly used in weather and ocean forecasting. The approach involves a large-scale generalized nonlinear least-squares problem. Solving the resulting sequence of sparse linear subproblems requires the use of sophisticated numerical linear algebra methods. In practical applications, the computational demands severely limit the number of iterations of a Krylov subspace solver that can be performed and so high-quality preconditioners are vital. In this paper, we introduce variational data assimilation from a numerical linear algebra perspective and review current solution techniques, with a focus on the challenges that arise in large-scale geophysical systems.
Submitted 10 June, 2025;
originally announced June 2025.
-
Block Alpha-Circulant Preconditioners for All-at-Once Diffusion-Based Covariance Operators
Authors:
Jemima M. Tabeart,
Selime Gürol,
John W. Pearson,
Anthony T. Weaver
Abstract:
Covariance matrices are central to data assimilation and inverse methods derived from statistical estimation theory. Previous work has considered the application of an all-at-once diffusion-based representation of a covariance matrix operator in order to exploit inherent parallelism in the underlying problem. In this paper, we provide practical methods to apply block $α$-circulant preconditioners to the all-at-once system for the case where the main diffusion operation matrix cannot be readily diagonalized using a discrete Fourier transform. Our new framework applies the block $α$-circulant preconditioner approximately by solving an inner block diagonal problem via a choice of inner iterative approaches. Our first method applies Chebyshev semi-iteration to a symmetric positive definite matrix, shifted by a complex scaling of the identity. We extend theoretical results for Chebyshev semi-iteration in the symmetric positive definite setting, to obtain computable bounds on the asymptotic convergence factor for each of the complex sub-problems. The second approach transforms the complex sub-problem into a (generalized) saddle point system with real coefficients. Numerical experiments reveal that in the case of unlimited computational resources, both methods can match the iteration counts of the `best-case' block $α$-circulant preconditioner. We also provide a practical adaptation to the nested Chebyshev approach, which improves performance in the case of a limited computational budget. Using an appropriate choice of $α$ our new approaches are robust and efficient in terms of outer iterations and matrix--vector products.
Submitted 16 June, 2025; v1 submitted 4 June, 2025;
originally announced June 2025.
-
On the impact of observation error correlations in data assimilation, with application to along-track altimeter data
Authors:
Olivier Goux,
Anthony Weaver,
Selime Gürol,
Oliver Guillet,
Andrea Piacentini
Abstract:
Data assimilation involves estimating the state of a system by combining observations from various sources with a background estimate of the state. The weights given to the observations and background state depend on their specified error covariance matrices. Observation errors are often assumed to be uncorrelated even though this assumption is inaccurate for many modern data-sets such as those from satellite observing systems. As methods allowing for a more realistic representation of observation-error correlations are emerging, our aim in this article is to provide insight on their expected impact in data assimilation. First, we use a simple idealised system to analyse the effect of observation-error correlations on the spectral characteristics of the solution. Next, we assess the relevance of these results in a more realistic setting in which simulated along-track (nadir) altimeter observations with correlated errors are assimilated in a global ocean model using a three-dimensional variational assimilation (3D-Var) method. Correlated observation errors are modelled in the 3D-Var system using a diffusion operator. When the correlation length scale of observation error is small compared to that of background error, inflating the observation-error variances can mitigate most of the negative effects from neglecting the observation-error correlations. Accounting for observation-error correlations in this situation still outperforms variance inflation since it allows small-scale information in the observations to be more effectively extracted and does not affect the convergence of the minimization. Conversely, when the correlation length scale of observation error is large compared to that of background error, the effect of observation-error correlations cannot be properly approximated with variance inflation.
However, the correlation model needs to be constructed carefully to ensure the minimization problem is adequately conditioned so that a robust solution can be obtained. Practical ways to achieve this are discussed.
Submitted 12 March, 2025;
originally announced March 2025.
-
An Efficient Scaled spectral preconditioner for sequences of symmetric positive definite linear systems
Authors:
Youssef Diouane,
Selime Gürol,
Oussama Mouhtal,
Dominique Orban
Abstract:
We explore a scaled spectral preconditioner for the efficient solution of sequences of symmetric and positive-definite linear systems. We design the scaled preconditioner not only as an approximation of the inverse of the linear system but also with consideration of its use within the conjugate gradient (CG) method. We propose three different strategies for selecting a scaling parameter, which aims to position the eigenvalues of the preconditioned matrix in a way that reduces the energy norm of the error, the quantity that CG monotonically decreases at each iteration. Our focus is on accelerating convergence especially in the early iterations, which is particularly important when CG is truncated due to computational cost constraints. Numerical experiments in data assimilation confirm that the scaled spectral preconditioner can significantly improve early CG convergence with negligible computational cost.
Submitted 3 October, 2024;
originally announced October 2024.
-
A general error analysis for randomized low-rank approximation with application to data assimilation
Authors:
Alexandre Scotto Di Perrotolo,
Youssef Diouane,
Selime Gürol,
Xavier Vasseur
Abstract:
Randomized algorithms have proven to perform well on a large class of numerical linear algebra problems. Their theoretical analysis is critical to provide guarantees on their behaviour, and in this sense, the stochastic analysis of the randomized low-rank approximation error plays a central role. Indeed, several randomized methods for the approximation of dominant eigen- or singular modes can be rewritten as low-rank approximation methods. However, despite the large variety of algorithms, the existing theoretical frameworks for their analysis rely on a specific structure for the covariance matrix that is not adapted to all the algorithms. We propose a general framework for the stochastic analysis of the low-rank approximation error in Frobenius norm for centered and non-standard Gaussian matrices. Under minimal assumptions on the covariance matrix, we derive accurate bounds both in expectation and probability. Our bounds have clear interpretations that enable us to derive properties and motivate practical choices for the covariance matrix resulting in efficient low-rank approximation algorithms. The most commonly used bounds in the literature have been demonstrated as a specific instance of the bounds proposed here, with the additional contribution of being tighter. Numerical experiments related to data assimilation further illustrate that exploiting the problem structure to select the covariance matrix improves the performance as suggested by our bounds.
Submitted 8 May, 2024;
originally announced May 2024.
-
A filtered multilevel Monte Carlo method for estimating the expectation of cell-centered discretized random fields
Authors:
Jérémy Briant,
Paul Mycek,
Mayeul Destouches,
Olivier Goux,
Serge Gratton,
Selime Gürol,
Ehouarn Simon,
Anthony T. Weaver
Abstract:
In this paper, we investigate the use of multilevel Monte Carlo (MLMC) methods for estimating the expectation of discretized random fields. Specifically, we consider a setting in which the input and output vectors of numerical simulators have inconsistent dimensions across the multilevel hierarchy. This motivates the introduction of grid transfer operators borrowed from multigrid methods. By adapting mathematical tools from multigrid methods, we perform a theoretical spectral analysis of the MLMC estimator of the expectation of discretized random fields, in the specific case of linear, symmetric and circulant simulators. We then propose filtered MLMC (F-MLMC) estimators based on a filtering mechanism similar to the smoothing process of multigrid methods, and we show that the filtering operators improve the estimation of both the small- and large-scale components of the variance, resulting in a reduction of the total variance of the estimator. Next, the conclusions of the spectral analysis are experimentally verified with a one-dimensional illustration. Finally, the proposed F-MLMC estimator is applied to the problem of estimating the discretized variance field of a diffusion-based covariance operator, which amounts to estimating the expectation of a discretized random field. The numerical experiments support the conclusions of the theoretical analysis even with non-linear simulators, and demonstrate the improvements brought by the F-MLMC estimator compared to both a crude MC and an unfiltered MLMC estimator.
Submitted 4 June, 2025; v1 submitted 10 November, 2023;
originally announced November 2023.
-
Multivariate extensions of the Multilevel Best Linear Unbiased Estimator for ensemble-variational data assimilation
Authors:
Mayeul Destouches,
Paul Mycek,
Selime Gürol
Abstract:
Multilevel estimators aim at reducing the variance of Monte Carlo statistical estimators, by combining samples generated with simulators of different costs and accuracies. In particular, the recent work of Schaden and Ullmann (2020) on the multilevel best linear unbiased estimator (MLBLUE) introduces a framework unifying several multilevel and multifidelity techniques. The MLBLUE is reintroduced here using a variance minimization approach rather than the regression approach of Schaden and Ullmann. We then discuss possible extensions of the scalar MLBLUE to a multidimensional setting, i.e. from the expectation of scalar random variables to the expectation of random vectors. Several estimators of increasing complexity are proposed: a) multilevel estimators with scalar weights, b) with element-wise weights, c) with spectral weights and d) with general matrix weights. The computational cost of each method is discussed. We finally extend the MLBLUE to the estimation of second-order moments in the multidimensional case, i.e. to the estimation of covariance matrices. The multilevel estimators proposed are d) a multilevel estimator with scalar weights and e) with element-wise weights. In large-dimension applications such as data assimilation for geosciences, the latter estimator is computationally unaffordable. As a remedy, we also propose f) a multilevel covariance matrix estimator with optimal multilevel localization, inspired by the optimal localization theory of Ménétrier and Auligné (2015). Some practical details on weighted MLMC estimators of covariance matrices are given in the appendix.
Submitted 12 September, 2024; v1 submitted 12 June, 2023;
originally announced June 2023.
-
Impact of correlated observation errors on the convergence of the conjugate gradient algorithm in variational data assimilation
Authors:
Olivier Goux,
Selime Gürol,
Anthony T. Weaver,
Oliver Guillet,
Youssef Diouane
Abstract:
An important class of nonlinear weighted least-squares problems arises from the assimilation of observations in atmospheric and ocean models. In variational data assimilation, inverse error covariance matrices define the weighting matrices of the least-squares problem. For observation errors, a diagonal matrix (i.e., uncorrelated errors) is often assumed for simplicity even when observation errors are suspected to be correlated. While accounting for observation-error correlations should improve the quality of the solution, it also affects the convergence rate of the minimization algorithms used to iterate to the solution. If the minimization process is stopped before reaching full convergence, which is usually the case in operational applications, the solution may be degraded even if the observation-error correlations are correctly accounted for. In this article, we explore the influence of the observation-error correlation matrix (R) on the convergence rate of a preconditioned conjugate gradient (PCG) algorithm applied to a one-dimensional variational data assimilation (1D-Var) problem. We design the idealised 1D-Var system to include two key features used in more complex systems: we use the background error covariance matrix (B) as a preconditioner (B-PCG); and we use a diffusion operator to model spatial correlations in B and R. Analytical and numerical results with the 1D-Var system show a strong sensitivity of the convergence rate of B-PCG to the parameters of the diffusion-based correlation models. Depending on the parameter choices, correlated observation errors can either speed up or slow down the convergence. In practice, a compromise may be required in the parameter specifications of B and R between staying close to the best available estimates on the one hand and ensuring an adequate convergence rate of the minimization algorithm on the other.
Submitted 5 December, 2022;
originally announced December 2022.
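As a rough illustration of the B-preconditioned conjugate gradient idea described in the abstract above, the following is a minimal sketch and not the authors' implementation. It assumes a small dense toy system with Hessian `A = B^{-1} + H^T R^{-1} H`, uses an exponential correlation matrix for `B` in place of the diffusion operator, and takes `R` to be diagonal. The function name `pcg` and all parameter values are hypothetical.

```python
import numpy as np

def pcg(A, b, apply_M, max_iter=50, tol=1e-10):
    """Preconditioned conjugate gradient for A x = b, with A symmetric positive definite.

    apply_M(r) applies the preconditioner (an approximation of A^{-1}) to r.
    Returns the approximate solution and the number of iterations performed.
    """
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x
    z = apply_M(r)
    p = z.copy()
    rz = r @ z
    iters = 0
    for iters in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        # Stop on the relative residual norm.
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = apply_M(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, iters

# Toy 1D setup (all values are illustrative assumptions).
n = 20
idx = np.arange(n)
B = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 3.0)  # exponential correlations, not a diffusion operator
H = np.eye(n)[::2]                                       # observe every other grid point
R_inv = np.eye(H.shape[0]) / 0.25                        # uncorrelated observation errors, variance 0.25
A = np.linalg.inv(B) + H.T @ R_inv @ H                   # Hessian of the quadratic cost function
b = np.ones(n)

x, iters = pcg(A, b, apply_M=lambda r: B @ r, max_iter=200)
print(iters, np.linalg.norm(A @ x - b))
```

With `B` as the preconditioner, the preconditioned matrix is `I + B H^T R^{-1} H`, so its eigenvalues that differ from one are limited in number by the number of observations. That is one reason this choice is attractive when only a few iterations are affordable.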
-
A general error analysis for randomized low-rank approximation methods
Authors:
Youssef Diouane,
Selime Gürol,
Alexandre Scotto Di Perrotolo,
Xavier Vasseur
Abstract:
We propose a general error analysis related to the low-rank approximation of a given real matrix in both the spectral and Frobenius norms. First, we derive deterministic error bounds that hold with some minimal assumptions. Second, we derive error bounds in expectation in the non-standard Gaussian case, assuming a non-trivial mean and a general covariance matrix for the random matrix variable. The proposed analysis generalizes and improves the error bounds for spectral and Frobenius norms proposed by Halko, Martinsson and Tropp. Third, we consider the Randomized Singular Value Decomposition and specialize our error bounds in expectation in this setting. Numerical experiments on an instructional synthetic test case demonstrate the tightness of the new error bounds.
Submitted 20 June, 2022; v1 submitted 17 June, 2022;
originally announced June 2022.
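For readers unfamiliar with the Randomized Singular Value Decomposition mentioned in the abstract above, here is a minimal sketch of the basic scheme in the style of Halko, Martinsson and Tropp. It assumes a standard Gaussian test matrix, not the non-standard Gaussian setting (non-trivial mean, general covariance) that the paper analyses. The function name `randomized_svd` and the `oversample` parameter are illustrative choices.

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, seed=0):
    """Approximate rank-`rank` SVD of A via a Gaussian range finder.

    Sketch under assumptions: standard Gaussian test matrix, no power iterations.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + oversample, n)
    Omega = rng.standard_normal((n, k))      # random test matrix
    Q, _ = np.linalg.qr(A @ Omega)           # orthonormal basis for the sampled range
    B = Q.T @ A                              # small (k x n) projected matrix
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank, :]

# Usage on a synthetic matrix of exact rank 5.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 5)) @ rng.standard_normal((5, 30))
U, s, Vt = randomized_svd(A, rank=5)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
print(rel_err)
```

For a matrix whose numerical rank exceeds the target rank, the approximation error depends on the decay of the trailing singular values, which is the quantity that error bounds of the kind discussed in the abstract control.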
-
Latent Space Data Assimilation by using Deep Learning
Authors:
Mathis Peyron,
Anthony Fillion,
Selime Gürol,
Victor Marchais,
Serge Gratton,
Pierre Boudier,
Gael Goret
Abstract:
Performing Data Assimilation (DA) at a low cost is of prime concern in Earth system modeling, particularly at the time of big data where huge quantities of observations are available. Capitalizing on the ability of Neural Networks techniques for approximating the solution of PDEs, we incorporate Deep Learning (DL) methods into a DA framework. More precisely, we exploit the latent structure provided by autoencoders (AEs) to design an Ensemble Transform Kalman Filter with model error (ETKF-Q) in the latent space. Model dynamics are also propagated within the latent space via a surrogate neural network. This novel ETKF-Q-Latent (hereafter referred to as ETKF-Q-L) algorithm is tested on a tailored instructional version of Lorenz 96 equations, named the augmented Lorenz 96 system: it possesses a latent structure that accurately represents the observed dynamics. Numerical experiments based on this particular system evidence that the ETKF-Q-L approach both reduces the computational cost and provides better accuracy than state-of-the-art algorithms, such as the ETKF-Q.
Submitted 1 April, 2021;
originally announced April 2021.
-
Data Assimilation Networks
Authors:
Pierre Boudier,
Anthony Fillion,
Serge Gratton,
Selime Gürol,
Sixin Zhang
Abstract:
Data assimilation (DA) aims at forecasting the state of a dynamical system by combining a mathematical representation of the system with noisy observations taking into account their uncertainties. State-of-the-art methods are based on Gaussian error statistics and the linearization of the non-linear dynamics, which may lead to sub-optimal methods. In this respect, there are still open questions on how to improve these methods. In this paper, we propose a fully data driven deep learning architecture generalizing recurrent Elman networks and data assimilation algorithms which approximate a sequence of prior and posterior densities conditioned on noisy observations. By construction our approach can be used for general nonlinear dynamics and non-Gaussian densities. On numerical experiments based on the well-known Lorenz-95 system and with Gaussian error statistics, our architecture achieves comparable performance to EnKF on both the analysis and the propagation of probability density functions of the system state at a given time without using any explicit regularization technique.
Submitted 25 May, 2023; v1 submitted 19 October, 2020;
originally announced October 2020.
-
A note on preconditioning weighted linear least squares, with consequences for weakly-constrained variational data assimilation
Authors:
Serge Gratton,
Selime Gürol,
Ehouarn Simon,
Philippe L. Toint
Abstract:
The effect of preconditioning linear weighted least-squares using an approximation of the model matrix is analyzed, showing the interplay of the eigenstructures of both the model and weighting matrices. A small example is given illustrating the resulting potential inefficiency of such preconditioners. Consequences of these results in the context of the weakly-constrained 4D-Var data assimilation problem are finally discussed.
Submitted 26 September, 2017;
originally announced September 2017.
-
On the use of the saddle formulation in weakly-constrained 4D-VAR data assimilation
Authors:
S. Gratton,
S. Gürol,
E. Simon,
Ph. L. Toint
Abstract:
This paper discusses the practical use of the saddle variational formulation for the weakly-constrained 4D-VAR method in data assimilation. It is shown that the method, in its original form, may produce erratic results or diverge because of the inherent lack of monotonicity of the produced objective function values. Convergent, variationally coherent variants of the algorithm are then proposed whose practical performance is compared to that of other formulations. This comparison is conducted on two data assimilation instances (Burgers equation and the Quasi-Geostrophic model), using two different assumptions on the parallel computing environment. Because these variants essentially retain the parallelization advantages of the original proposal, they often --- but not always --- perform best, even for moderate numbers of computing processes.
Submitted 19 September, 2017;
originally announced September 2017.