-
Inference in epidemiological agent-based models using ensemble-based data assimilation
Authors:
Tadeo Javier Cocucci,
Manuel Pulido,
Juan Aparicio,
Juan Ruiz,
Ignacio Simoy,
Santiago Rosa
Abstract:
To represent the complex individual interactions in the dynamics of disease spread informed by data, the coupling of an epidemiological agent-based model with the ensemble Kalman filter is proposed. The statistical inference of the propagation of a disease by means of ensemble-based data assimilation systems has been studied in previous work. The models used are mostly compartmental models representing the mean field evolution through ordinary differential equations. These techniques make it possible to monitor the propagation of infections from data and to estimate several parameters of epidemiological interest. However, many important features that depend on individual interactions cannot be represented in the mean field equations, such as social networks and bubbles, contact tracing, isolation of individuals at risk, and social network-based distancing strategies. Agent-based models can describe contact networks at an individual level, including demographic attributes such as age, neighbourhood, household, workplaces, schools, and entertainment places, among others. Nevertheless, these models have several unknown parameters that are difficult to estimate. In this work, we propose the use of ensemble-based data assimilation techniques to calibrate an agent-based model using daily epidemiological data. This raises the challenge of having to adapt the agent populations to incorporate the information provided by the coarse-grained data. To do this, two stochastic strategies to correct the model predictions are developed. The ensemble Kalman filter with perturbed observations is used for the joint estimation of the state and some key epidemiological parameters. We conduct experiments with an agent-based model designed for COVID-19 and assess the proposed methodology on synthetic data and on COVID-19 daily reports from Ciudad Autónoma de Buenos Aires, Argentina.
Submitted 29 October, 2021;
originally announced November 2021.
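As a rough illustration of the ensemble Kalman filter with perturbed observations named in the abstract above, the following Python sketch updates an augmented ensemble of [state, parameter] pairs from one scalar observation. It is not the authors' implementation: the function name `enkf_perturbed_obs`, the two-component toy state, and the direct observation of a single component are assumptions made for illustration, and the strategies the paper develops for adapting agent populations are not represented.

```python
import random


def enkf_perturbed_obs(ensemble, y, obs_var, obs_index=0, rng=None):
    """One stochastic EnKF analysis step for a scalar observation.

    ensemble  : list of members, each a list [state..., parameter...]
    y         : observed value of component `obs_index`
    obs_var   : observation error variance
    All names here are hypothetical, chosen for this sketch.
    """
    rng = rng or random.Random(0)
    n = len(ensemble)
    dim = len(ensemble[0])

    # Ensemble means of every component of the augmented state.
    means = [sum(m[k] for m in ensemble) / n for k in range(dim)]

    # Members projected onto observation space (a direct observation here).
    hx = [m[obs_index] for m in ensemble]
    hx_mean = means[obs_index]

    # Sample variance of the predicted observation and sample covariance
    # between each component and the predicted observation.
    var_hx = sum((h - hx_mean) ** 2 for h in hx) / (n - 1)
    cov = [
        sum((m[k] - means[k]) * (h - hx_mean) for m, h in zip(ensemble, hx)) / (n - 1)
        for k in range(dim)
    ]

    # Kalman gain for a scalar observation.
    gain = [c / (var_hx + obs_var) for c in cov]

    analysis = []
    for m, h in zip(ensemble, hx):
        # Each member assimilates its own perturbed copy of the observation.
        y_pert = y + rng.gauss(0.0, obs_var ** 0.5)
        innovation = y_pert - h
        analysis.append([m[k] + gain[k] * innovation for k in range(dim)])
    return analysis
```

Because the gain is built from sample covariances, the unobserved parameter is corrected only through its ensemble correlation with the observed component.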
-
A Framework for Causal Discovery in non-intervenable systems
Authors:
Peter Jan van Leeuwen,
Michael DeCaria,
Nachiketa Chakaborty,
Manuel Pulido
Abstract:
Many frameworks exist to infer cause-and-effect relations in complex nonlinear systems, but a complete theory is lacking. A new framework is presented that is fully nonlinear, provides a complete information-theoretic disentanglement of causal processes, allows for nonlinear interactions between causes, identifies the causal strength of missing or unknown processes, and can analyze systems that cannot be represented on Directed Acyclic Graphs. The basic building blocks are information-theoretic measures such as (conditional) mutual information and a new concept called certainty that monotonically increases with the information available about the target process. The framework is presented in detail and compared with other existing frameworks, and the treatment of confounders is discussed. While there are systems with structures that the framework cannot disentangle, it is argued that any causal framework that is based on integrated quantities will miss potentially important information in the underlying probability density functions. The framework is tested on several highly simplified stochastic processes to demonstrate how blocking and gateways are handled, and on the chaotic Lorenz 1963 system. We show that the framework provides information on the local dynamics, but also reveals information on the larger scale structure of the underlying attractor. Furthermore, by applying it to real observations related to the El Niño-Southern Oscillation system, we demonstrate its power and advantage over other methodologies.
Submitted 27 September, 2021; v1 submitted 5 October, 2020;
originally announced October 2020.
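The abstract above names mutual information as one of the framework's building blocks. As a minimal sketch of that quantity only, the Python function below computes a plug-in estimate of I(X;Y) from paired discrete samples. The function name `mutual_information` and the use of simple frequency counts are assumptions for illustration; the paper's own estimators, its conditional variants, and its "certainty" measure are not reproduced here.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log( p(x,y) / (p(x) p(y)) ), written with raw counts.
        mi += (c / n) * log(c * n / (px[x] * py[y]))
    return mi
```

For two balanced binary variables that always agree, the estimate equals log 2 nats; for samples whose joint frequencies factorize exactly, it is zero.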
-
Model error covariance estimation in particle and ensemble Kalman filters using an online expectation-maximization algorithm
Authors:
Tadeo Javier Cocucci,
Manuel Pulido,
Magdalena Lucini,
Pierre Tandeo
Abstract:
The performance of ensemble-based data assimilation techniques that estimate the state of a dynamical system from partial observations depends crucially on the prescribed uncertainty of the model dynamics and of the observations. These are not usually known and have to be inferred. Many approaches have been proposed to tackle this problem, including fully Bayesian, likelihood maximization and innovation-based techniques. This work focuses on maximization of the likelihood function via the expectation-maximization (EM) algorithm to infer the model error covariance combined with ensemble Kalman filters and particle filters to estimate the state. The classical application of the EM algorithm in a data assimilation context involves filtering and smoothing a fixed batch of observations in order to complete a single iteration. This is an inconvenience when using sequential filtering in high-dimensional applications. Motivated by this, an adaptation of the algorithm that can process observations and update the parameters on the fly, with some underlying simplifications, is presented. The proposed technique was evaluated and achieved good performance in experiments with the Lorenz-63 and the 40-variable Lorenz-96 dynamical systems designed to represent some common scenarios in data assimilation such as non-linearity, chaoticity and model misspecification.
Submitted 4 March, 2020;
originally announced March 2020.
-
Model uncertainty estimation using the expectation maximization algorithm and a particle flow filter
Authors:
María Magdalena Lucini,
Peter Jan van Leeuwen,
Manuel Pulido
Abstract:
Model error covariances play a central role in the performance of data assimilation methods applied to nonlinear state-space models. However, these covariances are largely unknown in most applications. A misspecification of the model error covariance has a strong impact on the computation of the posterior probability density function, leading to unreliable estimations and even to a total failure of the assimilation procedure. In this work, we propose the combination of the Expectation-Maximization algorithm (EM) with an efficient particle filter to estimate the model error covariance, using a batch of observations. Based on the EM algorithm principles, the proposed method encompasses two stages: the expectation stage, in which a particle filter is used with the present estimate of the model error covariance as given to find the probability density function that maximizes the likelihood, followed by a maximization stage in which the expectation under the probability density function found in the expectation step is maximized as a function of the elements of the model error covariance. The novel algorithm presented here combines the EM with a fixed point algorithm and does not require a particle smoother to approximate the posterior densities. We demonstrate that the new method accurately and efficiently solves the linear model problem. Furthermore, for the chaotic nonlinear Lorenz-96 model, we show that the method is stable even for an observation error covariance 10 times larger than the estimated model error covariance matrix, and that it is successful in high-dimensional situations where the dimension of the estimated matrix is 1600.
Submitted 4 November, 2019;
originally announced November 2019.
-
Kernel embedded nonlinear observational mappings in the variational mapping particle filter
Authors:
Manuel Pulido,
Peter Jan van Leeuwen,
Derek J. Posselt
Abstract:
Recently, some works have suggested methods to combine variational probabilistic inference with Monte Carlo sampling. One promising approach is via local optimal transport. In this approach, a gradient steepest descent method based on local optimal transport principles is formulated to deterministically transform point samples from an intermediate density to a posterior density. The local mappings that transform the intermediate densities are embedded in a reproducing kernel Hilbert space (RKHS). This variational mapping method requires the evaluation of the log-posterior density gradient and therefore the adjoint of the observational operator. In this work, we evaluate nonlinear observational mappings in the variational mapping method using two approximations that avoid the adjoint: an ensemble-based approximation, in which the gradient is approximated by the particle covariances in the state and observational spaces (the so-called ensemble space), and an RKHS approximation, in which the observational mapping is embedded in an RKHS and the gradient is derived there. The approximations are evaluated for highly nonlinear observational operators and in a low-dimensional chaotic dynamical system. The RKHS approximation is shown to be highly successful and superior to the ensemble approximation.
Submitted 29 January, 2019;
originally announced January 2019.
-
A Review of Innovation-Based Methods to Jointly Estimate Model and Observation Error Covariance Matrices in Ensemble Data Assimilation
Authors:
Pierre Tandeo,
Pierre Ailliot,
Marc Bocquet,
Alberto Carrassi,
Takemasa Miyoshi,
Manuel Pulido,
Yicun Zhen
Abstract:
Data assimilation combines forecasts from a numerical model with observations. Most of the current data assimilation algorithms consider the model and observation error terms as additive Gaussian noise, specified by their covariance matrices Q and R, respectively. These error covariances, and specifically their respective amplitudes, determine the weights given to the background (i.e., the model forecasts) and to the observations in the solution of data assimilation algorithms (i.e., the analysis). Consequently, Q and R matrices significantly impact the accuracy of the analysis. This review aims to present and to discuss, with a unified framework, different methods to jointly estimate the Q and R matrices using ensemble-based data assimilation techniques. Most of the methodologies developed to date use the innovations, defined as differences between the observations and the projection of the forecasts onto the observation space. These methodologies are based on two main statistical criteria: (i) the method of moments, in which the theoretical and empirical moments of the innovations are assumed to be equal, and (ii) methods that use the likelihood of the observations, themselves contained in the innovations. The reviewed methods assume that innovations are Gaussian random variables, although extension to other distributions is possible for likelihood-based methods. The methods also show some differences in terms of levels of complexity and applicability to high-dimensional systems. The conclusion of the review discusses the key challenges to further develop estimation methods for Q and R. These challenges include taking into account time-varying error covariances, using limited observational coverage, estimating additional deterministic error terms, or accounting for correlated noises.
Submitted 19 May, 2020; v1 submitted 30 July, 2018;
originally announced July 2018.
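The method-of-moments criterion described in the abstract above can be illustrated in the simplest scalar setting. Assuming a directly observed scalar state, zero-mean innovations, and known forecast variances, the expected squared innovation equals the forecast variance plus R, so R can be estimated by matching the empirical second moment. The function name `estimate_obs_variance` and this reduced setting are assumptions for illustration; the sketch estimates only R and does not represent any specific joint Q and R method covered by the review.

```python
def estimate_obs_variance(observations, forecast_means, forecast_variances):
    """Moment-matching estimate of a scalar observation error variance R.

    Relies on E[d^2] = P_f + R for innovations d = y - x_f when the state
    is observed directly (an assumption of this sketch).
    """
    # Innovations: observations minus forecasts in observation space.
    innovations = [y - f for y, f in zip(observations, forecast_means)]
    n = len(innovations)

    # Empirical second moment of the innovations.
    mean_sq_innovation = sum(d * d for d in innovations) / n

    # Average forecast variance over the same time window.
    mean_forecast_var = sum(forecast_variances) / n

    # Clamp at zero, since sampling noise can make the difference negative.
    return max(0.0, mean_sq_innovation - mean_forecast_var)
```

The clamp reflects a practical issue with moment-based estimates: with few samples the raw difference can fall below zero, which is not a valid variance.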
-
Inference of stochastic parameterizations for model error treatment using nested ensemble Kalman filters
Authors:
Guillermo Scheffler,
Juan Ruiz,
Manuel Pulido
Abstract:
Stochastic parameterizations are increasingly being used to represent the uncertainty associated with model errors in ensemble forecasting and data assimilation. One of the challenges associated with the use of these parameterizations is the optimization of the properties of the stochastic forcings within their formulation. In this work a hierarchical data assimilation approach based on two nested ensemble Kalman filters is proposed for inferring parameters associated with a stochastic parameterization. The proposed technique is based on the Rao-Blackwellization of the parameter estimation problem. The technique consists in using an ensemble of ensemble Kalman filters, each of them using a different set of stochastic parameter values. We show the ability of the technique to infer parameters related to the covariance structure of stochastic representations of model error in the Lorenz-96 dynamical system. The evaluation is conducted with stochastic twin experiments and imperfect model experiments with unresolved physics in the forecast model. The proposed technique performs successfully under different model error covariance structures. The technique is proposed to be applied offline as part of an a priori optimization of the data assimilation system and could in principle be extended to the estimation of other hyperparameters of a data assimilation system.
Submitted 27 July, 2018;
originally announced July 2018.
-
Kernel embedding of maps for sequential Bayesian inference: The variational mapping particle filter
Authors:
Manuel Pulido,
Peter Jan van Leeuwen
Abstract:
In this work, a novel sequential Monte Carlo filter is introduced which aims at efficient sampling of high-dimensional state spaces with a limited number of particles. Particles are pushed forward from the prior to the posterior density using a sequence of mappings that minimizes the Kullback-Leibler divergence between the posterior and the sequence of intermediate densities. The sequence of mappings represents a gradient flow. A key ingredient of the mappings is that they are embedded in a reproducing kernel Hilbert space, which allows for a practical and efficient algorithm. The embedding provides a direct means to calculate the gradient of the Kullback-Leibler divergence leading to quick convergence using well-known gradient-based stochastic optimization algorithms. Evaluation of the method is conducted in the chaotic Lorenz-63 system, the Lorenz-96 system, which is a coarse prototype of atmospheric dynamics, and an epidemic model that describes cholera dynamics. No resampling is required in the mapping particle filter even for long recursive sequences. The number of effective particles remains close to the total number of particles in all the experiments.
Submitted 29 May, 2018;
originally announced May 2018.
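To make the idea of pushing particles along a kernel-embedded gradient flow more concrete, the Python sketch below applies a Stein-variational-style update in one dimension with a Gaussian kernel. This update rule is used here as an assumed stand-in and is not taken from the paper; the function name `kernel_flow_step`, the fixed bandwidth, and the plain fixed-step iteration are likewise illustrative choices, whereas the abstract refers to gradient-based stochastic optimization algorithms.

```python
import math
import random


def kernel_flow_step(particles, grad_log_post, step=0.1, bandwidth=1.0):
    """Move 1-D particles one step along a kernel-smoothed gradient flow.

    grad_log_post : function returning d/dx log p(x) of the target density
    All names and defaults are hypothetical, chosen for this sketch.
    """
    n = len(particles)
    grads = [grad_log_post(x) for x in particles]
    h2 = bandwidth ** 2
    moved = []
    for xi in particles:
        velocity = 0.0
        for xj, gj in zip(particles, grads):
            k = math.exp(-0.5 * (xj - xi) ** 2 / h2)
            # First term: attraction toward high target density.
            # Second term: kernel gradient, which repels nearby particles.
            velocity += k * gj + k * (xi - xj) / h2
        moved.append(xi + step * velocity / n)
    return moved
```

A usage example under the same assumptions: starting from samples of a standard normal and iterating with `grad_log_post = lambda x: -(x - 3.0)` moves the particles toward a unit-variance Gaussian centred at 3 while the repulsion term keeps them spread out, with no resampling step involved.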
-
DADA: Data Assimilation for the Detection and Attribution of Weather- and Climate-related Events
Authors:
Alexis Hannart,
Alberto Carrassi,
Marc Bocquet,
Michael Ghil,
Philippe Naveau,
Manuel Pulido,
Juan Ruiz,
Pierre Tandeo
Abstract:
We describe a new approach allowing for systematic causal attribution of weather and climate-related events, in near-real time. The method is purposely designed to facilitate its implementation at meteorological centers by relying on data treatments that are routinely performed when numerically forecasting the weather. Namely, we show that causal attribution can be obtained as a by-product of so-called data assimilation procedures that are run on a daily basis to update the meteorological model with new atmospheric observations; hence, the proposed methodology can take advantage of the powerful computational and observational capacity of weather forecasting centers. We explain the theoretical rationale of this approach and sketch the most prominent features of a "data assimilation-based detection and attribution" (DADA) procedure. The proposal is illustrated in the context of the classical three-variable Lorenz model with additional forcing. Several theoretical and practical research questions that need to be addressed to make the proposal readily operational within weather forecasting centers are finally laid out.
Submitted 17 March, 2015;
originally announced March 2015.